Modified interferon alpha with reduced immunogenicity

ABSTRACT

The present invention relates to polypeptides to be administered especially to humans and in particular for therapeutic use. The polypeptides are modified polypeptides whereby the modification results in a reduced propensity for the polypeptide to elicit an immune response upon administration to the human subject. The invention relates in particular to the modification of human interferon alpha, and specifically interferon alpha 2 (INFα2), to result in proteins that are substantially non-immunogenic or less immunogenic than any non-modified counterpart when used in vivo.

FIELD OF THE INVENTION

The present invention relates to polypeptides to be administered especially to humans and in particular for therapeutic use. The polypeptides are modified polypeptides whereby the modification results in a reduced propensity for the polypeptide to elicit an immune response upon administration to the human subject. The invention in particular relates to the modification of human interferon and specifically human interferon α2 (INFα2) to result in INFα2 protein variants that are substantially non-immunogenic or less immunogenic than any non-modified counterpart when used in vivo. The invention relates furthermore to T-cell epitope peptides derived from said non-modified protein by means of which it is possible to create modified INFα2 variants with reduced immunogenicity.

BACKGROUND OF THE INVENTION

There are many instances whereby the efficacy of a therapeutic protein is limited by an unwanted immune reaction to the therapeutic protein. Several mouse monoclonal antibodies have shown promise as therapies in a number of human disease settings but in certain cases have failed due to the induction of significant degrees of a human anti-murine antibody (HAMA) response [Schroff, R. W. et al (1985) Cancer Res. 45: 879-885; Shawler, D. L. et al (1985) J. Immunol. 135: 1530-1535]. For monoclonal antibodies, a number of techniques have been developed in an attempt to reduce the HAMA response [WO 89/09622; EP 0239400; EP 0438310; WO 91/06667]. These recombinant DNA approaches have generally reduced the mouse genetic information in the final antibody construct whilst increasing the human genetic information in the final construct. Notwithstanding, the resultant “humanized” antibodies have, in several cases, still elicited an immune response in patients [Issacs J. D. (1990) Sem. Immunol. 2: 449, 456; Rebello, P. R. et al (1999) Transplantation 68: 1417-1420].

Antibodies are not the only class of polypeptide molecule administered as a therapeutic agent against which an immune response may be mounted. Even proteins of human origin and with the same amino acid sequences as occur within humans can still induce an immune response in humans. Notable examples include the therapeutic use of granulocyte-macrophage colony stimulating factor [Wadhwa, M. et al (1999) Clin. Cancer Res. 5: 1353-1361] and INFα2 [Russo, D. et al (1996) Bri. J. Haem. 94: 300-305; Stein, R. et al (1988) New Engl. J. Med. 318: 1409-1413] amongst others.

A principal factor in the induction of an immune response is the presence within the protein of peptides that can stimulate the activity of T-cells via presentation on MHC class II molecules, so-called “T-cell epitopes”. Such potential T-cell epitopes are commonly defined as any amino acid residue sequence with the ability to bind to MHC Class II molecules. Such T-cell epitopes can be measured to establish MHC binding. Implicitly, a “T-cell epitope” means an epitope which when bound to MHC molecules can be recognized by a T-cell receptor (TCR), and which can, at least in principle, cause the activation of these T-cells by engaging a TCR to promote a T-cell response. It is, however, usually understood that certain peptides which are found to bind to MHC Class II molecules may be retained in a protein sequence because such peptides are recognized as “self” within the organism into which the final protein is administered.

It is known that certain of these T-cell epitope peptides can be released during the degradation of peptides, polypeptides or proteins within cells and subsequently be presented by molecules of the major histocompatibility complex (MHC) in order to trigger the activation of T-cells. For peptides presented by MHC Class II, such activation of T-cells can then give rise, for example, to an antibody response by direct stimulation of B-cells to produce such antibodies.

MHC Class II molecules are a group of highly polymorphic proteins which play a central role in helper T-cell selection and activation. The human leukocyte antigen group DR (HLA-DR) are the predominant isotype of this group of proteins and are the major focus of the present invention. However, isotypes HLA-DQ and HLA-DP perform similar functions, hence the present invention is equally applicable to these. The MHC class II DR molecule is made of an alpha and a beta chain which insert at their C-termini through the cell membrane. Each hetero-dimer possesses a ligand binding domain which binds to peptides varying between 9 and 20 amino acids in length, although the binding groove can accommodate a maximum of 11 amino acids. The ligand binding domain is comprised of amino acids 1 to 85 of the alpha chain, and amino acids 1 to 94 of the beta chain. DQ molecules have recently been shown to have an homologous structure and the DP family proteins are also expected to be very similar. In humans approximately 70 different allotypes of the DR isotype are known, for DQ there are 30 different allotypes and for DP 47 different allotypes are known. Each individual bears two to four DR alleles, two DQ and two DP alleles. The structure of a number of DR molecules has been solved and such structures point to an open-ended peptide binding groove with a number of hydrophobic pockets which engage hydrophobic residues (pocket residues) of the peptide [Brown et al Nature (1993) 364: 33; Stern et al (1994) Nature 368: 215]. Polymorphism identifying the different allotypes of class II molecule contributes to a wide diversity of different binding surfaces for peptides within the peptide binding groove and at the population level ensures maximal flexibility with regard to the ability to recognize foreign proteins and mount an immune response to pathogenic organisms. There is a considerable amount of polymorphism within the ligand binding domain with distinct “families” within different geographical populations and ethnic groups. This polymorphism affects the binding characteristics of the peptide binding domain, thus different “families” of DR molecules will have specificities for peptides with different sequence properties, although there may be some overlap. This specificity determines recognition of Th-cell epitopes (Class II T-cell response) which are ultimately responsible for driving the antibody response to B-cell epitopes present on the same protein from which the Th-cell epitope is derived. Thus, the immune response to a protein in an individual is heavily influenced by T-cell epitope recognition which is a function of the peptide binding specificity of that individual's HLA-DR allotype. Therefore, in order to identify T-cell epitopes within a protein or peptide in the context of a global population, it is desirable to consider the binding properties of as diverse a set of HLA-DR allotypes as possible, thus covering as high a percentage of the world population as possible.

An immune response to a therapeutic protein, such as the protein which is the object of this invention, proceeds via the MHC class II peptide presentation pathway. Here exogenous proteins are engulfed and processed for presentation in association with MHC class II molecules of the DR, DQ or DP type. MHC Class II molecules are expressed by professional antigen presenting cells (APCs), such as macrophages and dendritic cells amongst others. Engagement of a MHC class II peptide complex by a cognate T-cell receptor on the surface of the T-cell, together with the cross-binding of certain other co-receptors such as the CD4 molecule, can induce an activated state within the T-cell.

Activation leads to the release of cytokines further activating other lymphocytes such as B cells to produce antibodies or activating T killer cells as a full cellular immune response. The ability of a peptide to bind a given MHC class II molecule for presentation on the surface of an APC is dependent on a number of factors, most notably its primary sequence. This will influence both its propensity for proteolytic cleavage and also its affinity for binding within the peptide binding cleft of the MHC class II molecule. The MHC class II/peptide complex on the APC surface presents a binding face to a particular T-cell receptor (TCR) able to recognize determinants provided both by exposed residues of the peptide and the MHC class II molecule.

In the art there are procedures for identifying synthetic peptides able to bind MHC class II molecules (e.g. WO98/52976 and WO00/34317). Such peptides may not function as T-cell epitopes in all situations, particularly in vivo, due to the processing pathways or other phenomena. T-cell epitope identification is the first step to epitope elimination. The identification and removal of potential T-cell epitopes from proteins has been previously disclosed. In the art methods have been provided to enable the detection of T-cell epitopes, usually by computational means scanning for recognized sequence motifs in experimentally determined T-cell epitopes or alternatively using computational techniques to predict MHC class II-binding peptides and in particular DR-binding peptides.

WO98/52976 and WO00/34317 teach computational threading approaches to identifying polypeptide sequences with the potential to bind a sub-set of human MHC class II DR allotypes. In these teachings, predicted T-cell epitopes are removed by the use of judicious amino acid substitution within the primary sequence of the therapeutic antibody or non-antibody protein of both non-human and human derivation.

Other techniques exploiting soluble complexes of recombinant MHC molecules in combination with synthetic peptides and able to bind to T-cell clones from peripheral blood samples from human or experimental animal subjects have been used in the art [Kern, F. et al (1998) Nature Medicine 4: 975-978; Kwok, W. W. et al (2001) TRENDS in Immunology 22: 583-588] and may also be exploited in an epitope identification strategy.

As depicted above and as a consequence thereof, it would be desirable to identify and to remove, or at least to reduce, T-cell epitopes from a given peptide, polypeptide or protein that is in principle therapeutically valuable but originally immunogenic.

One of these therapeutically valuable molecules is INFα2. The molecule is an important glycoprotein cytokine expressed by activated macrophages. The protein has antiviral activity and stimulates the production of at least two enzymes, a protein kinase and an oligoadenylate synthetase, on binding to the interferon alpha receptor in expressing cells. The mature INFα2 protein is a single polypeptide of 165 amino acids produced by post-translational processing of a 188 amino acid pre-cursor protein by cleavage of a 23 amino acid signal sequence from the amino terminus. Several different subtypes of human INFα2 are known showing minor differences between primary amino acid sequences. Thus INFα2a and INFα2b differ in only one residue at position 23 of the mature protein chain, being lysine in INFα2a and arginine in INFα2b. Whilst the disclosures of the present invention are directed towards the sequence of INFα2b, it can be seen that for all practical purposes the sequence of INFα2a may be considered interchangeably with the subject INFα2b subtype of the present invention. The amino acid sequence of INFα2(a,b) (depicted as one-letter code) is as follows: CDLPQTHSLGSRRTLMLLAQMR (R,K) ISLFSCLKDRHDFGFPQEEFGNQFQKAETIPVLHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQLNDLEACVIQGVGVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAWEVVRAEIMRSFSLSTNLQESLRSKE
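The sequence statements above (165 residues in the mature chain, a 188-residue precursor less a 23-residue signal sequence, and a single a/b difference at position 23) can be checked with a short sketch. The block layout in groups of ten residues, the `?` placeholder and the function name are illustrative assumptions and are not part of the disclosure.

```python
# Mature INFα2 sequence from the text, written in blocks of ten residues.
# "?" marks position 23, the only residue that differs between the subtypes.
TEMPLATE = "".join([
    "CDLPQTHSLG", "SRRTLMLLAQ", "MR?ISLFSCL", "KDRHDFGFPQ", "EEFGNQFQKA",
    "ETIPVLHEMI", "QQIFNLFSTK", "DSSAAWDETL", "LDKFYTELYQ", "QLNDLEACVI",
    "QGVGVTETPL", "MKEDSILAVR", "KYFQRITLYL", "KEKKYSPCAW", "EVVRAEIMRS",
    "FSLSTNLQES", "LRSKE",
])

# Lysine in INFα2a, arginine in INFα2b, as stated in the text.
RESIDUE_23 = {"2a": "K", "2b": "R"}


def mature_sequence(subtype):
    """Return the mature chain for subtype '2a' or '2b'."""
    return TEMPLATE.replace("?", RESIDUE_23[subtype])


seq_a = mature_sequence("2a")
seq_b = mature_sequence("2b")

# 1-based positions at which the two subtypes differ.
differences = [i + 1 for i, (x, y) in enumerate(zip(seq_a, seq_b)) if x != y]

print(len(seq_a), len(seq_b), differences)  # 165 165 [23]
print(188 - 23 == len(seq_b))               # True
```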

The protein has considerable clinical importance as a broad spectrum anti-viral, anti-proliferative and immunomodulating agent. Recombinant and other preparations of INFα2 have been used therapeutically in a variety of cancer and viral indications in man [reviewed in Sen, G. G. and Lengyel P, (1992), J. Biol. Chem. 267: 5017-5020]. However, despite very significant therapeutic benefit to large numbers of patients, resistance to therapy in certain patients has been documented and one important mechanism of resistance has been shown to be the development of neutralising antibodies detectable in the serum of treated patients [Quesada, J. R. et al (1985) J. Clin. Oncology 3: 1522-1528; Stein R. G. et al (1988) ibid; Russo, D. et al (1996) ibid; Brooks M. G. et al (1989) Gut 30: 1116-1122]. An immune response in these patients is mounted to the therapeutic interferon despite the fact that a molecule of at least identical primary structure is produced endogenously in man.

Others have provided modified INFα2a and methods of use [U.S. Pat. No. 4,496,537; U.S. Pat. No. 5,972,331; U.S. Pat. No. 5,480,640; U.S. Pat. No. 5,190,751; U.S. Pat. No. 4,959,210], but these approaches have been directed towards improvements in the commercial production of INFα2a. Such teachings do not recognize the importance of T-cell epitopes to the immunogenic properties of the protein, nor have they been conceived to directly influence said properties in a specific and controlled way according to the scheme of the present invention.

However, there is a continued need for INFα2a analogues with enhanced properties. Desired enhancements include alternative schemes and modalities for the expression and purification of the said therapeutic, but also and especially, improvements in the biological properties of the protein. There is a particular need for enhancement of the in vivo characteristics when administered to the human subject. In this regard, it is highly desired to provide INFα2a with reduced or absent potential to induce an immune response in the human subject.

SUMMARY AND DESCRIPTION OF THE INVENTION

The present invention provides for modified forms of human interferon α, and specifically the interferon α2 type, herein called “INFα2”, in which the immune characteristic is modified by means of reduced or removed numbers of potential T-cell epitopes.

The invention discloses sequences identified within the INFα2 primary sequence that are potential T-cell epitopes by virtue of MHC class II binding potential. This disclosure pertains specifically to the human INFα2 protein of 165 amino acid residues. The invention also discloses specific positions within the primary sequence of the molecule which according to the invention are to be altered by specific amino acid substitution, addition or deletion without in principle affecting the biological activity. In cases in which the loss of immunogenicity can be achieved only by a simultaneous loss of biological activity, it is possible to restore said activity by further alterations within the amino acid sequence of the protein.

The invention furthermore discloses methods to produce such modified molecules, and above all methods to identify said T-cell epitopes which require alteration in order to reduce or remove immunogenic sites.

The protein according to this invention would be expected to display an increased circulation time within the human subject and would be of particular benefit in chronic or recurring disease settings such as is the case for a number of indications for INFα2. The present invention provides for modified forms of INFα2 proteins that are expected to display enhanced properties in vivo. These modified INFα2 molecules can be used in pharmaceutical compositions.

In summary the invention relates to the following issues:

-   a modified molecule having the biological activity of human interferon alpha 2 (INFα2) and being substantially non-immunogenic or less immunogenic than any non-modified molecule having the same biological activity when used in vivo;
-   a corresponding molecule, wherein said loss of immunogenicity is achieved by removing one or more T-cell epitopes, preferably one T-cell epitope, derived from the originally non-modified molecule and/or by reduction in numbers of MHC allotypes able to bind peptides derived from said molecule;
-   a corresponding molecule, wherein said originally present T-cell epitopes are MHC class II ligands or peptide sequences which show the ability to stimulate or bind T-cells via presentation on MHC class II;
-   a corresponding molecule, wherein said ligands or peptide sequences are 13 mer or 15 mer peptides;
-   a corresponding molecule, wherein said peptide sequences are selected from the group as depicted in FIG. 1;
-   a corresponding molecule, wherein 1-9 amino acid residues, preferably one amino acid residue, in any of the originally present T-cell epitopes are altered;
-   a corresponding molecule, wherein the alteration of the amino acid residues is substitution, addition or deletion, preferably substitution, of originally present amino acid residue(s) by other amino acid residue(s) at specific position(s);
-   a corresponding molecule, wherein one or more of the amino acid residue substitutions are made as indicated in FIG. 2, and, in addition, optionally one or more of the amino acid residue substitutions are carried out as indicated in FIG. 3 for the reduction in the number of MHC allotypes able to bind peptides derived from said molecule;
-   a corresponding molecule, wherein additionally further alteration, such as substitution, addition or deletion, is conducted to restore biological activity of said molecule;
-   a corresponding modified molecule, wherein the amino acid alteration is made with reference to an homologous protein sequence or with reference to in silico modeling techniques;

a modified molecule having the biological activity of human interferon alpha 2 (INFα2) and being substantially non-immunogenic or less immunogenic than any non-modified molecule having the same biological activity when used in vivo, obtainable by alteration of one or more amino acids in the primary sequence by (i) removing one or more T-cell epitopes derived from the originally non-modified molecule and being MHC class II ligands or peptide sequences which show the ability to stimulate or bind T-cells via presentation on MHC class II, and/or (ii) by reduction in numbers of MHC allotypes able to bind peptides derived from said molecule, wherein said modified molecule comprises alterations which are made at one or more positions within the following strings of contiguous amino acid residues of said primary sequence derived from the INFα2 wild-type: (a) ISLFSCLKDRHDFGFPQEEFGNQFQKAETIPVLH (R1), (b) FNLFSTKDSSAAWDE (R2), (c) KEDSILAVRKYFQRITLY (R3);

-   a corresponding molecule, wherein said alteration is substitution of 1-9 amino acid residues;
-   a corresponding molecule, wherein said substitution is conducted at one or more amino acid residues from the strings R1, R2 and R3, preferably R2 and R3, and more preferably R3;
-   a corresponding molecule, wherein additionally one or more substitutions of amino acid residues outside the sequence strings R1, R2 or R3 are conducted;
-   a corresponding molecule comprising an amino acid residue substitution made at one or more positions in the wild-type molecule: 24, 26, 27, 38, 55, 63, 64, 66, 67, 76, 84, 85, 89, 103, 110, 111, 116, 117, 119, 122, 123, 126, 128, 129, 130, 153, preferably at one or more of the following positions 26, 27, 38, 63, 85, 89, 103, 110, 111, 116, 117, 122, 123, 126, 128, 153, more preferably 103, 110, 111, 116, 117, 122, 123, 126, 128, 153;
-   a preferred embodiment, wherein said substitution is made at one or more positions selected from 26, 27, 38 and additionally at one or more positions selected from 103, 110, 111, 116, 117, 122, 123, 126, 128, 153, or alternatively, selected from 63, 85, 89 and 103, 110, 111, 116, 117, 122, 123, 126, 128, 153;
-   a corresponding molecule, wherein said substitution is made at one or more positions as specified in FIG. 4;
-   a corresponding molecule, wherein said substitution is made at positions L26P, F27S, F38E and/or I63T, Y85S, Y89D, Y89E, Y89N and/or V103E, L110G, M111T, M111S, M111E, I116S, I116Q, L117G, L117A, Y122E, Y122Q, F123H, I126A, L128A, L153S;
-   a preferred corresponding molecule wherein said substitution is made at positions L26P, F27S, F38E and/or I63T, Y85S, Y89D, Y89E, Y89N and/or V103E, L110G, M111T, M111S, M111E, I116S, I116Q, L117G, L117A;
-   a modified molecule having the biological activity of human interferon alpha 2 (INFα2) and being substantially non-immunogenic or less immunogenic than any non-modified molecule having the same biological activity when used in vivo, obtainable by substitution of one or more amino acids in the primary sequence by (i) removing one or more T-cell epitopes derived from the originally non-modified molecule and being MHC class II ligands or peptide sequences which show the ability to stimulate or bind T-cells via presentation on MHC class II, and/or (ii) by reduction in numbers of MHC allotypes able to bind peptides derived from said molecule, wherein said substitution is made at one or more positions in a wild-type molecule INFα2a or INFα2b corresponding to at least one of the groups selected from:

(i) I24P, L26P, F27S, F38E, V55A,

(ii) I63T, L66A, F67D, F67E, W76H, F84D, F84E, Y85S, Y89D, Y89E, Y89N,

(iii) any position within sequence R3;

-   a corresponding molecule, whereby one or more of the following substitutions are made within sequence R3: V103E, L110G, L110S, M111T, M111S, M111E, I116S, I116Q, L117G, L117A, V119A, Y122Q, Y122E, Y122H, F123H, I126A, L128A, Y129N, L130G, L130, L153S, preferably Y122E, Y122Q, F123H, I126A, L128A, optionally containing additional amino acid residue alterations, preferably substitutions, which lead to a further diminished immunogenicity;
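The substitution notation used above (wild-type residue, position in the mature chain, replacement residue, for example Y122E) can be applied mechanically to the wild-type sequence. The following is a minimal sketch only: the helper name `apply_substitutions` and the choice of the INFα2b sequence as the starting point are illustrative assumptions, and the check against the wild-type residue is added here as a safeguard rather than taken from the disclosure.

```python
import re

# Wild-type mature INFα2b (arginine at position 23), in blocks of ten residues.
WILD_TYPE_2B = "".join([
    "CDLPQTHSLG", "SRRTLMLLAQ", "MRRISLFSCL", "KDRHDFGFPQ", "EEFGNQFQKA",
    "ETIPVLHEMI", "QQIFNLFSTK", "DSSAAWDETL", "LDKFYTELYQ", "QLNDLEACVI",
    "QGVGVTETPL", "MKEDSILAVR", "KYFQRITLYL", "KEKKYSPCAW", "EVVRAEIMRS",
    "FSLSTNLQES", "LRSKE",
])


def apply_substitutions(sequence, substitutions):
    """Apply codes such as 'Y122E' (1-based positions) to a sequence."""
    residues = list(sequence)
    for code in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
        if match is None:
            raise ValueError(f"unrecognised substitution code: {code}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position out of range in {code}")
        # Refuse to substitute if the stated wild-type residue is not present.
        if residues[position - 1] != old:
            raise ValueError(
                f"{code}: found {residues[position - 1]} at position {position}"
            )
        residues[position - 1] = new
    return "".join(residues)


# The preferred substitutions named above.
variant = apply_substitutions(WILD_TYPE_2B, ["Y122E", "F123H", "I126A", "L128A"])

print(WILD_TYPE_2B[118:130])  # residues 119-130 of the wild type: VRKYFQRITLYL
print(variant[118:130])       # same window in the variant:        VRKEHQRATAYL
```

Because each code is checked against the residue currently at that position, two alternatives for the same position (for example Y122E and Y122Q) cannot be applied to one sequence, which matches their reading as alternatives.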

a modified human interferon alpha 2 (INFα2) having reduced immunogenicity consisting of the following sequence: CDLPQTHSLGSRRTLMLLAQMRX⁰ISLFSCLKDRHDFGFPQEEFGNQFQKAETIPVLHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQLNDLEACVIQGVGVTETPLMKEDSILAVRKX¹X²QRX³TX⁴YLKEKKYSPCAWEVVRAEIMRSFSLSTNLQESLRSKE, wherein

-   X⁰ is R, K;
-   X¹ is Y, E, Q;
-   X² is F, H;
-   X³ is I, A; and
-   X⁴ is L, A;

whereby simultaneously X¹=Y, X²=F, X³=I and X⁴=L are excluded (this sequence corresponds to the wild-type INFα2);
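The formula above defines a small combinatorial family. As an illustration only, the allowed combinations of X¹ to X⁴ can be enumerated, leaving out the single wild-type combination. The dictionary keys and the position comments (122, 123, 126 and 128, read off from the location of each X in the sequence) are labels chosen for this sketch.

```python
from itertools import product

# Residue choices for each variable position, as listed above.
OPTIONS = {
    "X1": "YEQ",  # position 122
    "X2": "FH",   # position 123
    "X3": "IA",   # position 126
    "X4": "LA",   # position 128
}

# The combination that reproduces the wild-type sequence is excluded.
WILD_TYPE = ("Y", "F", "I", "L")

allowed = [combo for combo in product(*OPTIONS.values()) if combo != WILD_TYPE]

print(len(allowed))  # 23 (3 * 2 * 2 * 2 = 24 combinations, minus the wild type)
print(allowed[0])    # ('Y', 'F', 'I', 'A')
```

X⁰ (R or K) varies independently of the other positions, so each allowed combination corresponds to two sequences, one per subtype.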

a modified human interferon alpha 2 (INFα2) having reduced immunogenicity consisting of the following sequence: CDLPQTHSLGSRRTLMLLAQMRX⁰ISLFSCLKDRHDFGFPQEEFGNQFQKAETIPVLHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQLNDLEACVIQGVGVTETPX¹X²KEDSX³X⁴AVRKX⁵X⁶QRX⁷TX⁸YLKEKKYSPCAWEVVRAEIMRSFSX⁹STNLQESLRSKE, wherein

-   X⁰ is R, K;
-   X¹ is L, S, G;
-   X² is M, T, S, E;
-   X³ is I, S, Q;
-   X⁴ is L, G;
-   X⁵ is Y, E, Q;
-   X⁶ is F, H;
-   X⁷ is I, A;
-   X⁸ is L, A; and
-   X⁹ is L, S;

whereby simultaneously X¹=L, X²=M, X³=I, X⁴=L, X⁵=Y, X⁶=F, X⁷=I, X⁸=L and X⁹=L are excluded;

a modified human interferon alpha 2 (INFα2) having reduced immunogenicity consisting of the following sequence: CDLPQTHSLGSRRTLMLLAQMRX⁰ISLFSCLKDFGFPQEEFGNQFQKAETIPVLHEMIQQX¹X²NX³X⁴STKDSSAAX⁵DETLLDKX⁶X⁷TELX⁸QQLNDLEACVIQGVGVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAWEVVRAEIMRSFSLSTNLQESLRSKE, wherein

-   X⁰ is R, K;
-   X¹ is I, T;
-   X² is F, D, A;
-   X³ is L, A;
-   X⁴ is F, D, E;
-   X⁵ is W, H;
-   X⁶ is F, D, E;
-   X⁷ is Y, S; and
-   X⁸ is Y, D, E, N;

whereby simultaneously X¹=I, X²=F, X³=L, X⁴=F, X⁵=W, X⁶=F, X⁷=Y and X⁸=Y are excluded;

a modified human interferon alpha 2 (INFα2) having reduced immunogenicity consisting of the following sequence: CDLPQTHSLGSRRTLMLLAQMRX⁰ISLFSCLKDRHDFGFPQEEFGNQFQKAETIPVLHEMIQQX¹FNLFSTKDSSAAWDETLLDKFX²TELX³QQLNDLEACVIQGVGVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAWEVVRAEIMRSFSLSTNLQESLRSKE, wherein

-   X⁰ is R, K;
-   X¹ is I, T;
-   X² is Y, S; and
-   X³ is Y, D, E, N;

whereby simultaneously X¹=I, X²=Y and X³=Y are excluded;

a modified human interferon alpha 2 (INFα2) having reduced immunogenicity consisting of the following sequence: CDLPQTHSLGSRRTLMLLAQMRX⁰ISX¹X²SCLKDRHDFGX³PQEEFGNQFQKAETIPX⁴LHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQLNDLEACVIQGVGVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAWEVVRAEIMRSFSLSTNLQESLRSKE, wherein

-   X⁰ is R, K;
-   X¹ is L, P;
-   X² is F, S;
-   X³ is F, E; and
-   X⁴ is V, A;

whereby simultaneously X¹=L, X²=F, X³=F and X⁴=V are excluded; it should be pointed out that all exclusions specified in the above formulas refer to and shall include all non-deimmunized versions of INFα2 which are known in the prior art;

-   corresponding modified INFα2 sequences, wherein additional substitutions are made especially by the combinations as indicated in the claims;
-   a corresponding modified INFα2 sequence, wherein additional substitutions are made, preferably at one or more positions within partial sequence R1 and/or R2 and/or R3 (whereby R1, R2, R3 are defined as indicated above);
-   a DNA sequence coding for a modified INFα2 as described above and below;
-   a pharmaceutical composition comprising a modified molecule having the biological activity of INFα2 as defined above, optionally together with a pharmaceutically acceptable carrier, diluent or excipient;
-   a method for manufacturing a modified molecule having the biological activity of INFα2 as defined above and below comprising the following steps: (i) determining the amino acid sequence of the polypeptide or part thereof, (ii) identifying one or more potential T-cell epitopes within the amino acid sequence of the protein by any method including determination of the binding of the peptides to MHC molecules using in vitro or in silico techniques or biological assays, (iii) designing new sequence variants with one or more amino acids within the identified potential T-cell epitopes modified in such a way to substantially reduce or eliminate the activity of the T-cell epitope as determined by the binding of the peptides to MHC molecules using in vitro or in silico techniques or biological assays, or by binding of peptide-MHC complexes to T-cells, (iv) constructing such sequence variants by recombinant DNA techniques and testing said variants in order to identify one or more variants with desirable properties, and (v) optionally repeating steps (ii)-(iv);

-   a corresponding method, wherein step (iii) is carried out by substitution, addition or deletion of 1-9 amino acid residues in any of the originally present T-cell epitopes or with reference to an homologous protein sequence and/or in silico modeling techniques;
-   a corresponding method, wherein step (ii) is carried out by the following steps: (a) selecting a region of the peptide having a known amino acid residue sequence; (b) sequentially sampling overlapping amino acid residue segments of predetermined uniform size and constituted by at least three amino acid residues from the selected region; (c) calculating MHC Class II molecule binding score for each said sampled segment by summing assigned values for each hydrophobic amino acid residue side chain present in said sampled amino acid residue segment; and (d) identifying at least one of said segments suitable for modification, based on the calculated MHC Class II molecule binding score for that segment, to change overall MHC Class II binding score for the peptide without substantially reducing the therapeutic utility of the peptide;
-   a corresponding method, wherein step (c) is carried out by using a Böhm scoring function modified to include 12-6 van der Waals ligand-protein energy repulsive term and ligand conformational energy term by (1) providing a first data base of MHC Class II molecule models; (2) providing a second data base of allowed peptide backbones for said MHC Class II molecule models; (3) selecting a model from said first data base; (4) selecting an allowed peptide backbone from said second data base; (5) identifying amino acid residue side chains present in each sampled segment; (6) determining the binding affinity value for all side chains present in each sampled segment; and repeating steps (1) through (5) for each said model and each said backbone;
-   a peptide molecule, which is a T-cell epitope, consisting of 13 consecutive amino acid residues having a potential MHC class II binding activity and created from the primary sequence of non-modified INFα2, selected from the group as depicted in FIG. 1, FIG. 6 a-c;
-   a peptide molecule consisting of 15, preferably at least 9, consecutive amino acid residues having a potential MHC class II binding activity and created from the primary sequence of non-modified INFα2, selected from any of the groups of partial sequences R1, R2, R3 or selected from FIG. 7;
-   a peptide molecule consisting of 9-15 consecutive amino acid residues, having a potential MHC class II binding activity and created from the primary sequence of non-modified INFα2, whereby said molecule has a stimulation index of at least 1.8, preferably 1.8-2, more preferably >2, in a biological assay of cellular proliferation, wherein said index is taken as the value of cellular proliferation scored following stimulation by a peptide and divided by the value of cellular proliferation scored in control cells not in receipt of peptide and wherein cellular proliferation is measured by any suitable means according to standard methods as described in more detail in the Examples;
-   a corresponding peptide molecule having such stimulation index value consisting of any of the sequences as indicated in FIG. 6 or 7;
-   a corresponding peptide molecule having such stimulation index value comprising at least 9 consecutive amino acid residues from any of the INFα2 partial sequences R1, R2, R3 as defined above;
-   a use of a corresponding peptide, for the manufacture of INFα2 having substantially no or less immunogenicity than any non-modified molecule with the same or acceptably reduced degree of biological activity when used in vivo;
-   a pharmaceutical composition consisting of a synthetic peptide sequence as specified above and below and in the Figures having the biological activity of INFα2, optionally together with a pharmaceutically acceptable carrier, diluent or excipient.
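Steps (a) to (c) of the epitope identification method listed above (sampling overlapping segments of uniform size and summing assigned values for hydrophobic side chains) can be illustrated with a deliberately simplified sketch. The set of residues treated as hydrophobic and the value of 1.0 assigned to each are placeholders assumed for illustration; the text does not state the assigned values, and the modified Böhm scoring function is not reproduced here. Positions are reported relative to the selected region.

```python
# Hypothetical assigned values: 1.0 for each residue treated as hydrophobic.
# Both the residue set and the values are illustrative assumptions.
HYDROPHOBIC_VALUES = {aa: 1.0 for aa in "AVLIMFWY"}


def sample_segments(region, size=13):
    """Step (b): overlapping segments of uniform size, with 1-based start."""
    if size < 3:
        raise ValueError("segments must contain at least three residues")
    return [
        (start + 1, region[start:start + size])
        for start in range(len(region) - size + 1)
    ]


def binding_score(segment):
    """Step (c), simplified: sum the assigned values of hydrophobic residues."""
    return sum(HYDROPHOBIC_VALUES.get(aa, 0.0) for aa in segment)


def rank_segments(region, size=13):
    """Score every sampled segment, highest score first."""
    scored = [
        (binding_score(segment), start, segment)
        for start, segment in sample_segments(region, size)
    ]
    return sorted(scored, key=lambda item: (-item[0], item[1]))


# Step (a): the selected region, here the partial sequence R3 from the text.
R3 = "KEDSILAVRKYFQRITLY"

for score, start, segment in rank_segments(R3):
    print(start, segment, score)
```

An 18-residue region sampled with 13-residue segments yields six overlapping segments. Step (d), choosing which segment to modify, depends on the therapeutic utility of the resulting protein and is not captured by a score of this kind.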

The term “T-cell epitope” means according to the understanding of this invention an amino acid sequence which is able to bind MHC class II, able to stimulate T-cells and/or also to bind (without necessarily measurably activating) T-cells in complex with MHC class II.
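The ability to stimulate T-cells is quantified earlier in this summary by a stimulation index: proliferation following stimulation by a peptide divided by proliferation in control cells not in receipt of peptide, with a threshold of 1.8. A minimal sketch of that ratio follows. Averaging replicate wells before dividing, the function names and the proliferation counts are assumptions made for illustration; the counts are invented example values, not data from this disclosure.

```python
from statistics import mean


def stimulation_index(peptide_counts, control_counts):
    """Proliferation with peptide divided by proliferation without peptide.

    Averaging replicate wells is an assumption of this sketch.
    """
    control = mean(control_counts)
    if control <= 0:
        raise ValueError("control proliferation must be positive")
    return mean(peptide_counts) / control


def classify(index):
    """Map an index onto the ranges named in the text."""
    if index > 2.0:
        return "more preferred (>2)"
    if index >= 1.8:
        return "preferred (1.8-2)"
    return "below the 1.8 threshold"


# Invented example counts for three replicate wells each.
peptide_wells = [2400, 2600, 2500]
control_wells = [1000, 1100, 900]

index = stimulation_index(peptide_wells, control_wells)
print(index, classify(index))  # 2.5 more preferred (>2)
```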

The term “peptide” as used herein and in the appended claims, is a compound that includes two or more amino acids. The amino acids are linked together by a peptide bond (defined herein below). There are 20 different naturally occurring amino acids involved in the biological production of peptides, and any number of them may be linked in any order to form a peptide chain or ring. The naturally occurring amino acids employed in the biological production of peptides all have the L-configuration. Synthetic peptides can be prepared employing conventional synthetic methods, utilizing L-amino acids, D-amino acids, or various combinations of amino acids of the two different configurations. Some peptides contain only a few amino acid units. Short peptides, e.g., having less than ten amino acid units, are sometimes referred to as “oligopeptides”. Other peptides contain a large number of amino acid residues, e.g. up to 100 or more, and are referred to as “polypeptides”. By convention, a “polypeptide” may be considered as any peptide chain containing three or more amino acids, whereas an “oligopeptide” is usually considered as a particular type of “short” polypeptide. Thus, as used herein, it is understood that any reference to a “polypeptide” also includes an oligopeptide. Further, any reference to a “peptide” includes polypeptides, oligopeptides, and proteins. Each different arrangement of amino acids forms different polypeptides or proteins. The number of polypeptides, and hence the number of different proteins, that can be formed is practically unlimited. “Alpha carbon (Cα)” is the carbon atom of the carbon-hydrogen (CH) component that is in the peptide chain. A “side chain” is a pendant group to Cα that can comprise a simple or complex group or moiety, having physical dimensions that can vary significantly compared to the dimensions of the peptide.

The invention may be applied to any INFα2 species of molecule with substantially the same primary amino acid sequences as those disclosed herein and would include therefore INFα2 molecules derived by genetic engineering means or other processes and may contain more or less than 165 amino acid residues.

INFα2 proteins such as identified from other mammalian sources have in common many of the peptide sequences of the present disclosure and have in common many peptide sequences with substantially the same sequence as those of the disclosed listing. Such protein sequences equally therefore fall under the scope of the present invention. The invention is conceived to overcome the practical reality that soluble proteins introduced into autologous organisms can trigger an immune response resulting in development of host antibodies that bind to the soluble protein. A prominent example of this phenomenon, amongst others, is the clinical use of INFα2. A significant proportion of human patients treated with INFα2 make antibodies despite the fact that this protein is produced endogenously [Russo, D. et al (1996) ibid; Stein, R. et al (1988) ibid]. The present invention seeks to address this by providing INFα2 proteins with altered propensity to elicit an immune response on administration to the human host. According to the methods described herein, the inventors have discovered and now disclose the regions of the INFα2 molecule comprising the critical T-cell epitopes driving the immune responses to this autologous protein.

The general method of the present invention leading to the modified INFα2 comprises the following steps:

(a) determining the amino acid sequence of the polypeptide or part thereof;

(b) identifying one or more potential T-cell epitopes within the amino acid sequence of the protein by any method including determination of the binding of the peptides to MHC molecules using in vitro or in silico techniques or biological assays;

(c) designing new sequence variants with one or more amino acids within the identified potential T-cell epitopes modified in such a way to substantially reduce or eliminate the activity of the T-cell epitope as determined by the binding of the peptides to MHC molecules using in vitro or in silico techniques or biological assays. Such sequence variants are created in such a way to avoid creation of new potential T-cell epitopes by the sequence variations unless such new potential T-cell epitopes are, in turn, modified in such a way to substantially reduce or eliminate the activity of the T-cell epitope; and

(d) constructing such sequence variants by recombinant DNA techniques and testing said variants in order to identify one or more variants with desirable properties according to well known recombinant techniques.

The identification of potential T-cell epitopes according to step (b) can be carried out according to methods described previously in the prior art. Suitable methods are disclosed in WO 98/59244; WO 98/52976; WO 00/34317 and may preferably be used to identify binding propensity of INFα2a-derived peptides to an MHC class II molecule.

Another very efficacious method for identifying T-cell epitopes by calculation is described in EXAMPLE 1, which is a preferred embodiment according to this invention.

In practice a number of variant INFα2 proteins will be produced and tested for the desired immune and functional characteristics. The variant proteins will most preferably be produced by recombinant DNA techniques although other procedures including chemical synthesis of INFα2 fragments may be contemplated.

The results of an analysis according to step (b) of the above scheme and pertaining to the human INFα2a protein sequence of 165 amino acid residues are presented in FIG. 1. The results of a design and constructs according to steps (c) and (d) of the above scheme and pertaining to the modified molecule of this invention are presented in FIGS. 2 and 3.

The invention relates to INFα2 analogues in which substitutions of at least one amino acid residue have been made at positions resulting in a substantial reduction in activity of or elimination of one or more potential T-cell epitopes from the protein. One or more amino acid substitutions at particular points within any of the potential MHC class II ligands identified in Table 1 may result in an INFα2 molecule with a reduced immunogenic potential when administered as a therapeutic to the human host.

It is most preferred to provide an INFα2 molecule in which amino acid modification (e.g. a substitution) is conducted within the most immunogenic regions of the parent molecule. The inventors herein have discovered that the most immunogenic regions of the INFα2 molecule in man are confined to three regions R1, R2 and R3 comprising respectively the amino acid sequences ISLFSCLKDRHDFGFPQEEFGNQFQKAETIPVLH; FNLFSTKDSSAAWDE and KEDSILAVRKYFQRITLY. The major preferred embodiments of the present invention comprise INFα2 molecules for which the MHC class II ligands of FIG. 1 that align either in their entirety or to a minimum of 9 amino acid residues with any of the above sequence elements R1, R2 or R3 are altered such as to eliminate binding or otherwise reduce the numbers of MHC allotypes to which the peptide can bind.
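The alignment criterion above can be illustrated with a minimal sketch. It assumes that "align" means sharing an identical contiguous stretch with a region, which is a simplifying reading rather than a definition given in the text; the function name `aligned_regions` and the example peptides are hypothetical.

```python
# Immunogenic regions R1, R2 and R3 as listed in the text.
REGIONS = {
    "R1": "ISLFSCLKDRHDFGFPQEEFGNQFQKAETIPVLH",
    "R2": "FNLFSTKDSSAAWDE",
    "R3": "KEDSILAVRKYFQRITLY",
}

def aligned_regions(peptide, min_overlap=9):
    """Return the names of regions the peptide aligns with.

    Assumption: a peptide aligns with a region if it lies entirely
    within the region, or if it shares an identical contiguous stretch
    of at least `min_overlap` residues with it.
    """
    hits = []
    for name, region in REGIONS.items():
        entire = peptide in region
        partial = any(
            peptide[i:i + min_overlap] in region
            for i in range(len(peptide) - min_overlap + 1)
        )
        if entire or partial:
            hits.append(name)
    return hits

print(aligned_regions("FNLFSTKDSSAAW"))      # 13-mer lying within R2
print(aligned_regions("GGGGKEDSILAVRGGGG"))  # shares 9 residues with R3
```

A ligand flagged in this way would be a candidate for the substitutions discussed below; a ligand with no hit falls outside the three regions under this reading.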

The preferred embodiments of the invention include the specific substitutions of FIG. 4. It is particularly preferred to provide modified INFα2 molecules containing combinations of substitutions from FIG. 4. Combinations which comprise modification to each of the immunogenic regions R1, R2 and R3 are preferred, and combinations comprising modifications to R2 and R3 are especially preferred although such preference is not intended to limit the combinations of substitution which are considered desirable.

For the elimination of T-cell epitopes, amino acid substitutions are preferably made at appropriate points within the peptide sequence predicted to achieve substantial reduction or elimination of the activity of the T-cell epitope. In practice an appropriate point will preferably equate to an amino acid residue binding within one of the pockets provided within the MHC class II binding groove.

It is most preferred to alter binding within the first pocket of the cleft at the so-called P1 or P1 anchor position of the peptide. The quality of binding interaction between the P1 anchor residue of the peptide and the first pocket of the MHC class II binding groove is recognized as being a major determinant of overall binding affinity for the whole peptide. An appropriate substitution at this position of the peptide will be for a residue less readily accommodated within the pocket, for example, substitution to a more hydrophilic residue. Amino acid residues in the peptide at positions equating to binding within other pocket regions within the MHC binding cleft are also considered and fall under the scope of the present invention.

It is understood that single amino acid substitutions within a given potential T-cell epitope are the most preferred route by which the epitope may be eliminated. Combinations of substitution within a single epitope may be contemplated and for example can be particularly appropriate where individually defined epitopes are in overlap with each other. Moreover, amino acid substitutions either singly within a given epitope or in combination within a single epitope may be made at positions not equating to the “pocket residues” with respect to the MHC class II binding groove, but at any point within the peptide sequence. Substitutions may be made with reference to a homologous structure or structural method produced using in silico techniques known in the art and may be based on known structural features of the molecule according to this invention. All such substitutions fall within the scope of the present invention.

Amino acid substitutions other than within the peptides identified above may be contemplated particularly when made in combination with substitution(s) made within a listed peptide. For example a change may be contemplated to restore structure or biological activity of the variant molecule. Such compensatory changes and changes to include deletion or addition of particular amino acid residues from the INFα2 polypeptide resulting in a variant with desired activity and in combination with changes in any of the disclosed peptides fall under the scope of the present invention.

In as far as this invention relates to modified INFα2, compositions containing such modified INFα2 proteins or fragments of modified INFα2 proteins and related compositions should be considered within the scope of the invention. In another aspect, the present invention relates to nucleic acids encoding modified INFα2 entities. In a further aspect the present invention relates to methods for therapeutic treatment of humans using the modified INFα2 proteins.

SHORT DESCRIPTION OF THE FIGURES

The invention will now be illustrated, but not limited, by the following examples. The examples refer to the following drawings. The amino acid residues are depicted throughout in one-letter code.

FIG. 1 provides peptide sequences in human INFα2a with potential human MHC class II binding activity.

FIG. 2 provides substitutions leading to the elimination of T-cell epitopes of human INFα2a (WT=wild-type residue).

FIG. 3 provides additional substitutions leading to the removal of a potential T-cell epitope for one or more MHC allotypes.

FIG. 4 provides preferred substitutions in human INFα2a (WT=wild-type residue, MUT=desired residue).

FIG. 5 provides a table of the INFα2 13-mer synthetic peptide sequences analysed using the MHC class II in vitro binding assay of EXAMPLE 2.

FIG. 6 shows the results of in vitro MHC peptide binding assays for MHC allotypes. a) indicates peptides with high affinity binding (0% inhibition by competitor reference peptide) for each of the MHC allotypes tested; b) indicates peptides with medium affinity (0-50% inhibition by competitor) binding for each of the MHC allotypes tested; c) indicates peptides with low (50-100% inhibition by competitor) affinity binding for each of the MHC allotypes tested and d) indicates peptides with no detectable binding to the MHC allotypes tested.

FIG. 7 provides a table of the INFα2 15-mer peptide sequences analysed using the naïve human in vitro T-cell assay of EXAMPLE 3. The peptide ID# and position of the N-terminal peptide residue within the INFα2 sequence are indicated.

FIG. 8 shows cumulative stimulation indexes from 6 individuals that respond to stimulation with IFNα peptides. Six donors from 20 screened responded to stimulation with one or more of 51 15-mer peptides from the IFNα sequence. Responses to individual peptides are grouped into three distinct regions with region three containing the most immunogenic peptides #38 and #39 (arrows). Control peptides C32 (DRB1-restricted) and C49 (DP-restricted) are included for comparison. Cross-hatching within each bar indicates the contribution from individual donors.

FIG. 9 shows the immunogenic regions within INFα and details the peptide sequences from these regions able to stimulate naïve human T-cells.

FIG. 10 provides a table indicating INFα peptides capable of promoting proliferation of naïve human T-cells in vitro. For 5 of the donors, responses are recorded to multiple overlapping peptides from the major epitope regions R1, R2 and R3. For 3 of the donors, responses are recorded to individual synthetic peptides from R1, R2 or R3.

FIG. 11 provides a table showing frequency of MHC class II alleles in the responding and non-responding donors to IFNα peptides. a=Numerator is number of donors with DR allele, denominator is number of donors that showed T cell proliferation in vitro to that peptide (total number of responding donors=6). b=Frequency of allele in donor population. Peptides for which two or fewer responses were recorded were not evaluated. All responding donors tested negative for DRB1*14. The DRB1*14 allotype has a frequency of 1.5% in the 20 donors tested. Allorestriction of a given peptide is determined by the frequency of an allele in the donor population and the number of responding donors that express the same allele. If a peptide is associated with any particular allele (allorestricted) then the percentage shown would be expected to be greater than the frequency for the allele in the population.

FIG. 12 provides tables of IC₅₀ values for 15-mer synthetic peptides in competition binding assay for particular MHC class II allotypes.

(a) Competition MHC class II peptide binding assay to determine relative binding affinities of INFα peptides capable of promoting proliferation of naïve human T-cells in vitro to DRB1*0101. INFα peptides were incubated with fixed HOM-2 cells in the presence of 10 μM biotinylated influenza haemagglutinin 307-319. The concentration of competitor peptide causing 50% inhibition of maximum biotinylated peptide binding was taken as the IC₅₀. Influenza 103-115 was included as a high affinity control. IC₅₀ ≤ 20 μM = high affinity, IC₅₀ = 20-100 μM = medium affinity, IC₅₀ ≥ 100 μM = low affinity.

(b) Competition MHC class II peptide binding assay to determine relative binding affinities of INFα peptides capable of promoting proliferation of naïve human T-cells in vitro to DRB1*0701. INFα peptides were incubated with fixed MOU (MANN) cells in the presence of 10 μM biotinylated tetanus toxin 828-840. The concentration of competitor peptide causing 50% inhibition of maximum biotinylated peptide binding was taken as the IC₅₀. Tetanus toxin 828-840 was included as a high affinity control. IC₅₀ ≤ 20 μM = high affinity, IC₅₀ = 20-100 μM = medium affinity, IC₅₀ ≥ 100 μM = low affinity.

(c) Competition MHC class II peptide binding assay to determine relative binding affinities of INFα peptides capable of promoting proliferation of naïve human T-cells in vitro to DRB1*0401. INFα peptides were incubated with fixed WT-51 cells in the presence of 50 μM biotinylated influenza haemagglutinin 307-319. The concentration of competitor peptide causing 50% inhibition of maximum biotinylated peptide binding was taken as the IC₅₀. Influenza 103-115 was included as a high affinity control. IC₅₀ ≤ 20 μM = high affinity, IC₅₀ = 20-100 μM = medium affinity, IC₅₀ ≥ 100 μM = low affinity.
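The affinity bands used in panels (a) to (c) can be written as a small helper. This is a sketch only: the stated bands overlap at exactly 100 μM (both "20-100" and "≥ 100"), and the code assumes that 100 μM is classed as low affinity; the function name `classify_ic50` is hypothetical.

```python
def classify_ic50(ic50_um):
    """Map an IC50 value (in micromolar) to the affinity band of FIG. 12.

    Assumption: the boundary value 100 uM is treated as low affinity,
    since the stated bands overlap at that point.
    """
    if ic50_um <= 20:
        return "high"
    if ic50_um < 100:
        return "medium"
    return "low"

# Illustrative values only, not data from FIG. 12.
for value in (5, 20, 60, 100, 250):
    print(value, classify_ic50(value))
```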

FIG. 13 provides a table detailing substitutions within INFα which provide molecules with retained activity in the anti-proliferation assay of EXAMPLE 7. WT=wild-type residue; #=residue number; Mut=mutation conducted. Epitope Region indicates location of substitution with respect to immunogenic epitope regions R1, R2 or R3.

FIG. 14 provides representative data of the anti-proliferative effect of selected mutant INFα2 molecules. Assays were conducted according to the methods of EXAMPLE 7. Panel a) shows activity of molecules with substitution within immunogenic epitope R1. Panel b) shows activity of molecules with substitution within immunogenic epitope R2. Panel c) shows activity of molecules with substitution within immunogenic epitope R3.

FIG. 15 provides panels that show individual donor responses to INFα synthetic peptides. Data from control peptides C32 (DRB1-restricted) and C49 (DP-restricted) are included for comparison. Immunogenic regions R1, R2, R3 are indicated on relevant panels. Threshold for positive stimulation index=2.

In the following Examples the invention is described in more detail, which shall not be interpreted as a limitation or restriction.

EXAMPLE 1

There are a number of factors that play important roles in determining the total structure of a protein or polypeptide. First, the peptide bond, i.e., that bond which joins the amino acids in the chain together, is a covalent bond. This bond is planar in structure, essentially a substituted amide. An “amide” is any of a group of organic compounds containing the grouping -CONH-.

The planar peptide bond linking Cα of adjacent amino acids may be represented as depicted below:

Because the O═C and the C-N atoms lie in a relatively rigid plane, free rotation does not occur about these axes. Hence, a plane schematically depicted by the interrupted line is sometimes referred to as an “amide plane” or “peptide plane”, wherein lie the oxygen (O), carbon (C), nitrogen (N), and hydrogen (H) atoms of the peptide backbone. At opposite corners of this amide plane are located the Cα atoms. Since there is substantially no rotation about the O═C and C-N atoms in the peptide or amide plane, a polypeptide chain thus comprises a series of planar peptide linkages joining the Cα atoms.

A second factor that plays an important role in defining the total structure or conformation of a polypeptide or protein is the angle of rotation of each amide plane about the common Cα linkage. The terms “angle of rotation” and “torsion angle” are hereinafter regarded as equivalent terms. Assuming that the O, C, N, and H atoms remain in the amide plane (which is usually a valid assumption, although there may be some slight deviations from planarity of these atoms for some conformations), these angles of rotation define the polypeptide's backbone conformation, i.e., the structure as it exists between adjacent residues. These two angles are known as φ and ψ. A set of the angles φ₁, ψ₁, where the subscript represents a particular residue of a polypeptide chain, thus effectively defines the polypeptide secondary structure. The conventions used in defining the φ, ψ angles, i.e., the reference points at which the amide planes form a zero degree angle, and the definition of which angle is φ, and which angle is ψ, for a given polypeptide, are defined in the literature (see, e.g., Ramachandran et al. Adv. Prot. Chem. 23:283-437 (1968), at pages 285-94, which pages are incorporated herein by reference).

The present method can be applied to any protein, and is based in part upon the discovery that in humans the primary Pocket 1 anchor position of MHC Class II molecule binding grooves has a well designed specificity for particular amino acid side chains. The specificity of this pocket is determined by the identity of the amino acid at position 86 of the beta chain of the MHC Class II molecule. This site is located at the bottom of Pocket 1 and determines the size of the side chain that can be accommodated by this pocket (Marshall, K. W., J. Immunol., 152:4946-4956 (1994)). If this residue is a glycine, then all hydrophobic aliphatic and aromatic amino acids (hydrophobic aliphatics being: valine, leucine, isoleucine, methionine and aromatics being: phenylalanine, tyrosine and tryptophan) can be accommodated in the pocket, a preference being for the aromatic side chains. If this pocket residue is a valine, then the side chain of this amino acid protrudes into the pocket and restricts the size of peptide side chains that can be accommodated such that only hydrophobic aliphatic side chains can be accommodated. Therefore, in an amino acid residue sequence, wherever an amino acid with a hydrophobic aliphatic or aromatic side chain is found, there is the potential for an MHC Class II restricted T-cell epitope to be present. If the side-chain is hydrophobic aliphatic, however, it is approximately twice as likely to be associated with a T-cell epitope as an aromatic side chain (assuming an approximately even distribution of Pocket 1 types throughout the global population).

A computational method embodying the present invention profiles the likelihood of peptide regions to contain T-cell epitopes as follows:

(1) The primary sequence of a peptide segment of predetermined length is scanned, and all hydrophobic aliphatic and aromatic side chains present are identified.

(2) The hydrophobic aliphatic side chains are assigned a value greater than that for the aromatic side chains; preferably about twice the value assigned to the aromatic side chains, e.g., a value of 2 for a hydrophobic aliphatic side chain and a value of 1 for an aromatic side chain.

(3) The values determined to be present are summed for each overlapping amino acid residue segment (window) of predetermined uniform length within the peptide, and the total value for a particular segment (window) is assigned to a single amino acid residue at an intermediate position of the segment (window), preferably to a residue at about the midpoint of the sampled segment (window). This procedure is repeated for each sampled overlapping amino acid residue segment (window). Thus, each amino acid residue of the peptide is assigned a value that relates to the likelihood of a T-cell epitope being present in that particular segment (window).

(4) The values calculated and assigned as described in Step 3, above, can be plotted against the amino acid coordinates of the entire amino acid residue sequence being assessed.

(5) All portions of the sequence which have a score of a predetermined value, e.g., a value of 1, are deemed likely to contain a T-cell epitope and can be modified, if desired.
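Steps (1) to (5) can be sketched in a few lines of Python. The sketch below is illustrative only: the window length of 9, the use of the zero-based index `start + window // 2` as the midpoint, and the reading of the step (5) cut-off as "greater than or equal to the predetermined value" are assumptions, and the function names are hypothetical.

```python
ALIPHATIC = set("VLIM")  # valine, leucine, isoleucine, methionine
AROMATIC = set("FYW")    # phenylalanine, tyrosine, tryptophan

def residue_value(aa):
    """Step (2): aliphatic side chains score 2, aromatic 1, others 0."""
    if aa in ALIPHATIC:
        return 2
    if aa in AROMATIC:
        return 1
    return 0

def window_profile(sequence, window=9):
    """Step (3): sum the values in each overlapping window and assign
    the total to the residue at about the window midpoint (0-based)."""
    profile = {}
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        profile[start + window // 2] = sum(residue_value(aa) for aa in segment)
    return profile

def flagged_positions(profile, threshold=1):
    """Step (5): positions whose score reaches the chosen threshold
    (assumed here to mean score >= threshold)."""
    return [pos for pos, score in sorted(profile.items()) if score >= threshold]

# Region R2 from the text, used purely as example input.
profile = window_profile("FNLFSTKDSSAAWDE", window=9)
print(profile)
print(flagged_positions(profile, threshold=3))
```

The dictionary returned by `window_profile` is the data that step (4) would plot against residue position.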

This particular aspect of the present invention provides a general method by which the regions of peptides likely to contain T-cell epitopes can be described. Modifications to the peptide in these regions have the potential to modify the MHC Class II binding characteristics.

According to another aspect of the present invention, T-cell epitopes can be predicted with greater accuracy by the use of a more sophisticated computational method which takes into account the interactions of peptides with models of MHC Class II alleles. The computational prediction of T-cell epitopes present within a peptide according to this particular aspect contemplates the construction of models of at least 42 MHC Class II alleles based upon the structures of all known MHC Class II molecules and a method for the use of these models in the computational identification of T-cell epitopes, the construction of libraries of peptide backbones for each model in order to allow for the known variability in relative peptide backbone alpha carbon (Cα) positions, the construction of libraries of amino-acid side chain conformations for each backbone dock with each model for each of the 20 amino-acid alternatives at positions critical for the interaction between peptide and MHC Class II molecule, and the use of these libraries of backbones and side-chain conformations in conjunction with a scoring function to select the optimum backbone and side-chain conformation for a particular peptide docked with a particular MHC Class II molecule and the derivation of a binding score from this interaction.

Models of MHC Class II molecules can be derived via homology modeling from a number of similar structures found in the Brookhaven Protein Data Bank (“PDB”). These may be made by the use of semi-automatic homology modeling software (Modeller, Sali A. & Blundell T L., 1993. J. Mol Biol 234:779-815) which incorporates a simulated annealing function, in conjunction with the CHARMm force-field for energy minimisation (available from Molecular Simulations Inc., San Diego, Calif.). Alternative modeling methods can be utilized as well.

The present method differs significantly from other computational methods which use libraries of experimentally derived binding data of each amino-acid alternative at each position in the binding groove for a small set of MHC Class II molecules (Marshall, K. W., et al., Biomed. Pept. Proteins Nucleic Acids, 1(3):157-162 (1995)) or yet other computational methods which use similar experimental binding data in order to define the binding characteristics of particular types of binding pockets within the groove, again using a relatively small subset of MHC Class II molecules, and then ‘mixing and matching’ pocket types from this pocket library to artificially create further ‘virtual’ MHC Class II molecules (Sturniolo T., et al., Nat. Biotech, 17(6): 555-561 (1999)). Both prior methods suffer the major disadvantage that, due to the complexity of the assays and the need to synthesize large numbers of peptide variants, only a small number of MHC Class II molecules can be experimentally scanned. Therefore the first prior method can only make predictions for a small number of MHC Class II molecules. The second prior method also makes the assumption that a pocket lined with similar amino-acids in one molecule will have the same binding characteristics when in the context of a different Class II allele and suffers further disadvantages in that only those MHC Class II molecules can be ‘virtually’ created which contain pockets contained within the pocket library. Using the modeling approach described herein, the structure of any number and type of MHC Class II molecules can be deduced, therefore alleles can be specifically selected to be representative of the global population. In addition, the number of MHC Class II molecules scanned can be increased by making further models rather than having to generate additional data via complex experimentation.

The use of a backbone library allows for variation in the positions of the Cα atoms of the various peptides being scanned when docked with particular MHC Class II molecules. This is again in contrast to the alternative prior computational methods described above which rely on the use of simplified peptide backbones for scanning amino-acid binding in particular pockets. These simplified backbones are not likely to be representative of backbone conformations found in ‘real’ peptides, leading to inaccuracies in prediction of peptide binding. The present backbone library is created by superposing the backbones of all peptides bound to MHC Class II molecules found within the Protein Data Bank and noting the root mean square (RMS) deviation between the Cα atoms of each of the eleven amino-acids located within the binding groove. While this library can be derived from a small number of suitable available mouse and human structures (currently 13), in order to allow for the possibility of even greater variability, the RMS figure for each Cα position is increased by 50%. The average Cα position of each amino-acid is then determined and a sphere drawn around this point whose radius equals the RMS deviation at that position plus 50%. This sphere represents all allowed Cα positions.
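The construction of a sphere of allowed Cα positions can be sketched as follows. The sketch assumes that the RMS deviation at a position is the root mean square distance of the superposed Cα atoms from their average position, which is one plausible reading of the text; the function names and the toy coordinates are hypothetical.

```python
import math

def allowed_sphere(positions, inflation=0.5):
    """Return (centre, radius) for one groove position.

    positions: (x, y, z) coordinates of the same C-alpha atom taken
    from each superposed structure.
    Assumption: RMS deviation is measured from the average position,
    and the radius is that RMS figure increased by `inflation` (50%).
    """
    n = len(positions)
    centre = tuple(sum(p[i] for p in positions) / n for i in range(3))
    rms = math.sqrt(sum(math.dist(p, centre) ** 2 for p in positions) / n)
    return centre, rms * (1.0 + inflation)

def in_sphere(point, centre, radius):
    """True if a candidate C-alpha position lies inside the sphere."""
    return math.dist(point, centre) <= radius

# Toy coordinates in angstroms, not taken from any real structure.
coords = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]
centre, radius = allowed_sphere(coords)
print(centre, radius)
print(in_sphere((1.4, 0, 0), centre, radius))
```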

Working from the Cα with the least RMS deviation (that of the amino-acid in Pocket 1 as mentioned above, equivalent to Position 2 of the 11 residues in the binding groove), the sphere is three-dimensionally gridded, and each vertex within the grid is then used as a possible location for a Cα of that amino-acid. The subsequent amide plane, corresponding to the peptide bond to the subsequent amino-acid, is grafted onto each of these Cαs and the φ and ψ angles are rotated step-wise at set intervals in order to position the subsequent Cα. If the subsequent Cα falls within the ‘sphere of allowed positions’ for this Cα then the orientation of the dipeptide is accepted, whereas if it falls outside the sphere then the dipeptide is rejected.

This process is then repeated for each of the subsequent Cα positions, such that the peptide grows from the Pocket 1 Cα ‘seed’, until all nine subsequent Cαs have been positioned from all possible permutations of the preceding Cαs. The process is then repeated once more for the single Cα preceding Pocket 1 to create a library of backbone Cα positions located within the binding groove.

The number of backbones generated is dependent upon several factors: the size of the ‘spheres of allowed positions’; the fineness of the gridding of the ‘primary sphere’ at the Pocket 1 position; the fineness of the step-wise rotation of the φ and ψ angles used to position subsequent Cαs. Using this process, a large library of backbones can be created. The larger the backbone library, the more likely it will be that the optimum fit will be found for a particular peptide within the binding groove of an MHC Class II molecule. Inasmuch as not all backbones will be suitable for docking with all the models of MHC Class II molecules due to clashes with amino-acids of the binding domains, for each allele a subset of the library is created comprising backbones which can be accommodated by that allele.
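The effect of grid fineness on the number of seed positions can be illustrated with a small sketch. It assumes a simple cubic lattice centred on the sphere centre, which is only one possible way to grid a sphere; the text does not specify the lattice, and the function name `grid_sphere` and the spacings used are hypothetical. The step-wise φ/ψ rotation that grows the backbone from each seed is not modelled here.

```python
def grid_sphere(centre, radius, spacing):
    """Return the vertices of a cubic grid that lie within the sphere.

    Assumption: the grid is a cubic lattice with the given spacing,
    aligned with the axes and centred on the sphere centre.
    """
    steps = int(radius // spacing)
    limit = (radius / spacing) ** 2
    points = []
    for i in range(-steps, steps + 1):
        for j in range(-steps, steps + 1):
            for k in range(-steps, steps + 1):
                # Keep only lattice vertices inside or on the sphere.
                if i * i + j * j + k * k <= limit:
                    points.append((centre[0] + i * spacing,
                                   centre[1] + j * spacing,
                                   centre[2] + k * spacing))
    return points

# A finer grid gives more candidate seed positions for the same sphere.
coarse = grid_sphere((0.0, 0.0, 0.0), radius=2.0, spacing=2.0)
fine = grid_sphere((0.0, 0.0, 0.0), radius=2.0, spacing=1.0)
print(len(coarse), len(fine))
```

Each vertex would serve as a candidate Pocket 1 Cα, so the size of the backbone library grows quickly as the spacing is reduced.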

The use of the backbone library, in conjunction with the models of MHC Class II molecules creates an exhaustive database consisting of allowed side chain conformations for each amino-acid in each position of the binding groove for each MHC Class II molecule docked with each allowed backbone. This data set is generated using a simple steric overlap function where a MHC Class II molecule is docked with a backbone and an amino-acid side chain is grafted onto the backbone at the desired position. Each of the rotatable bonds of the side chain is rotated step-wise at set intervals and the resultant positions of the atoms dependent upon that bond noted. The interaction of the atom with atoms of side-chains of the binding groove is noted and positions are either accepted or rejected according to the following criteria: The sum total of the overlap of all atoms so far positioned must not exceed a pre-determined value. Thus the stringency of the conformational search is a function of the interval used in the step-wise rotation of the bond and the pre-determined limit for the total overlap. This latter value can be small if it is known that a particular pocket is rigid, however the stringency can be relaxed if the positions of pocket side-chains are known to be relatively flexible. Thus allowances can be made to imitate variations in flexibility within pockets of the binding groove. This conformational search is then repeated for every amino-acid at every position of each backbone when docked with each of the MHC Class II molecules to create the exhaustive database of side-chain conformations.

A suitable mathematical expression is used to estimate the energy of binding between models of MHC Class II molecules in conjunction with peptide ligand conformations which have to be empirically derived by scanning the large database of backbone/side-chain conformations described above. Thus a protein is scanned for potential T-cell epitopes by subjecting each possible peptide of length varying between 9 and 20 amino-acids (although the length is kept constant for each scan) to the following computations: An MHC Class II molecule is selected together with a peptide backbone allowed for that molecule and the side-chains corresponding to the desired peptide sequence are grafted on. Atom identity and interatomic distance data relating to a particular side-chain at a particular position on the backbone are collected for each allowed conformation of that amino-acid (obtained from the database described above). This is repeated for each side-chain along the backbone and peptide scores derived using a scoring function. The best score for that backbone is retained and the process repeated for each allowed backbone for the selected model. The scores from all allowed backbones are compared and the highest score is deemed to be the peptide score for the desired peptide in that MHC Class II model. This process is then repeated for each model with every possible peptide derived from the protein being scanned, and the scores for peptides versus models are displayed.
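The outer loop of this scan can be sketched as below. The scoring function itself is not reproduced: it is passed in as a caller-supplied placeholder, and the search over side-chain conformations is assumed to happen inside it. The names `peptide_windows`, `scan_protein` and `toy_score`, and the model and backbone identifiers, are all hypothetical.

```python
def peptide_windows(protein, length):
    """All overlapping peptides of one fixed length, with start index."""
    return [(i, protein[i:i + length])
            for i in range(len(protein) - length + 1)]

def scan_protein(protein, length, allowed_backbones, score):
    """Return {(start, model): best score over the allowed backbones}.

    allowed_backbones: dict mapping a model name to the backbones
        allowed for that model.
    score: placeholder callable score(model, backbone, peptide) that is
        assumed to return the best score over side-chain conformations.
    """
    results = {}
    for start, peptide in peptide_windows(protein, length):
        for model, backbones in allowed_backbones.items():
            # The highest score over all allowed backbones is taken as
            # the peptide score for this model.
            results[(start, model)] = max(
                score(model, backbone, peptide) for backbone in backbones
            )
    return results

def toy_score(model, backbone, peptide):
    # Stand-in scoring function for demonstration only.
    return peptide.count("L") + backbone

table = scan_protein("ALLA", 3, {"model_1": [0, 1], "model_2": [2]}, toy_score)
print(table)
```

The resulting table of peptides versus models corresponds to the display described at the end of the paragraph above.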

In the context of the present invention, each ligand presented for thebinding affinity calculation is an amino-acid segment selected from apeptide or protein as discussed above. Thus, the ligand is a selectedstretch of amino acids about 9 to 20 amino acids in length derived froma peptide, polypeptide or protein of known sequence. The terms “aminoacids” and “residues” are hereinafter regarded as equivalent terms.

The ligand, in the form of the consecutive amino acids of the peptide tobe examined grafted onto a backbone from the backbone library, ispositioned in the binding cleft of an MHC Class II molecule from the MHCClass II molecule model library via the coordinates of the C″-α atoms ofthe peptide backbone and an allowed conformation for each side-chain isselected from the database of allowed conformations. The relevant atomidentities and interatomic distances are also retrieved from thisdatabase and used to calculate the peptide binding score. Ligands with ahigh binding affinity for the MHC Class II binding pocket are flagged ascandidates for site-directed mutagenesis. Amino-acid substitutions aremade in the flagged ligand (and hence in the protein of interest) whichis then retested using the scoring function in order to determinechanges which reduce the binding affinity below a predeterminedthreshold value. These changes can then be incorporated into the proteinof interest to remove T-cell epitopes.

Binding between the peptide ligand and the binding groove of MHC Class II molecules involves non-covalent interactions including, but not limited to: hydrogen bonds, electrostatic interactions, hydrophobic (lipophilic) interactions and Van der Waals interactions. These are included in the peptide scoring function as described in detail below.

It should be understood that a hydrogen bond is a non-covalent bond which can be formed between polar or charged groups and consists of a hydrogen atom shared by two other atoms. The hydrogen of the hydrogen donor has a positive charge whereas the hydrogen acceptor has a partial negative charge. For the purposes of peptide/protein interactions, hydrogen bond donors may be either nitrogens with hydrogen attached or hydrogens attached to oxygen or nitrogen. Hydrogen bond acceptor atoms may be oxygens not attached to hydrogen, nitrogens with no hydrogens attached and one or two connections, or sulphurs with only one connection. Certain atoms, such as oxygens attached to hydrogens or imine nitrogens (e.g. C═NH), may be either hydrogen acceptors or donors. Hydrogen bond energies range from 3 to 7 Kcal/mol and are much stronger than Van der Waals bonds, but weaker than covalent bonds. Hydrogen bonds are also highly directional and are at their strongest when the donor atom, hydrogen atom and acceptor atom are co-linear.

Electrostatic bonds are formed between oppositely charged ion pairs and the strength of the interaction is inversely proportional to the square of the distance between the atoms according to Coulomb's law. The optimal distance between ion pairs is about 2.8 Å. In protein/peptide interactions, electrostatic bonds may be formed between arginine, histidine or lysine and aspartate or glutamate. The strength of the bond will depend upon the pKa of the ionizing group and the dielectric constant of the medium, although they are approximately similar in strength to hydrogen bonds.

Lipophilic interactions are favorable hydrophobic-hydrophobic contacts that occur between the protein and peptide ligand. Usually, these will occur between hydrophobic amino acid side chains of the peptide buried within the pockets of the binding groove such that they are not exposed to solvent. Exposure of the hydrophobic residues to solvent is highly unfavorable since the surrounding solvent molecules are forced to hydrogen bond with each other, forming cage-like clathrate structures. The resultant decrease in entropy is highly unfavorable. Lipophilic atoms may be sulphurs which are neither polar nor hydrogen acceptors and carbon atoms which are not polar.

Van der Waals bonds are non-specific forces found between atoms which are 3-4 Å apart. They are weaker and less specific than hydrogen and electrostatic bonds. The distribution of electronic charge around an atom changes with time and, at any instant, the charge distribution is not symmetric. This transient asymmetry in electronic charge induces a similar asymmetry in neighboring atoms. The resultant attractive forces between atoms reach a maximum at the Van der Waals contact distance but diminish very rapidly at about 1 Å to about 2 Å. Conversely, as atoms become separated by less than the contact distance, increasingly strong repulsive forces become dominant as the outer electron clouds of the atoms overlap. Although the attractive forces are relatively weak compared to electrostatic and hydrogen bonds (about 0.6 Kcal/mol), the repulsive forces in particular may be very important in determining whether a peptide ligand may bind successfully to a protein.

In one embodiment, the Böhm scoring function (SCORE1 approach) is used to estimate the binding constant (Böhm, H. J., J. Comput Aided Mol. Des., 8(3):243-256 (1994), which is hereby incorporated in its entirety). In another embodiment, the scoring function (SCORE2 approach) is used to estimate the binding affinities as an indicator of a ligand containing a T-cell epitope (Böhm, H. J., J. Comput Aided Mol. Des., 12(4):309-323 (1998), which is hereby incorporated in its entirety). However, the Böhm scoring functions as described in the above references are used to estimate the binding affinity of a ligand to a protein where it is already known that the ligand successfully binds to the protein and the protein/ligand complex has had its structure solved, the solved structure being present in the Protein Data Bank (“PDB”). Therefore, the scoring function has been developed with the benefit of known positive binding data. In order to allow for discrimination between positive and negative binders, a repulsion term must be added to the equation. In addition, a more satisfactory estimate of binding energy is achieved by computing the lipophilic interactions in a pairwise manner rather than using the area based energy term of the above Böhm functions.

Therefore, in a preferred embodiment, the binding energy is estimated using a modified Böhm scoring function. In the modified Böhm scoring function, the binding energy between protein and ligand ($\Delta G_{bind}$) is estimated considering the following parameters: the reduction of binding energy due to the overall loss of translational and rotational entropy of the ligand ($\Delta G_{0}$); contributions from ideal hydrogen bonds ($\Delta G_{hb}$) where at least one partner is neutral; contributions from unperturbed ionic interactions ($\Delta G_{ionic}$); lipophilic interactions between lipophilic ligand atoms and lipophilic acceptor atoms ($\Delta G_{lipo}$); the loss of binding energy due to the freezing of internal degrees of freedom in the ligand, i.e., the freedom of rotation about each C-C bond is reduced ($\Delta G_{rot}$); and the energy of the interaction between the protein and ligand ($E_{VdW}$). Consideration of these terms gives equation 1:

$$\Delta G_{bind} = \Delta G_{0} + (\Delta G_{hb} \times N_{hb}) + (\Delta G_{ionic} \times N_{ionic}) + (\Delta G_{lipo} \times N_{lipo}) + (\Delta G_{rot} \times N_{rot}) + E_{VdW}$$

Where $N$ is the number of qualifying interactions for a specific term and, in one embodiment, $\Delta G_{0}$, $\Delta G_{hb}$, $\Delta G_{ionic}$, $\Delta G_{lipo}$ and $\Delta G_{rot}$ are constants which are given the values: 5.4, −4.7, −4.7, −0.17, and 1.4, respectively.
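As a minimal sketch of how equation 1 combines its terms, the following Python function sums the weighted interaction counts using the constants of the embodiment given above. It assumes that the rotational term enters as the constant multiplied by the count, in the same way as the other weighted terms (consistent with N being defined as the number of qualifying interactions for each term); the function and argument names are illustrative only.

```python
# Constants of the embodiment described in the text.
DG_0 = 5.4
DG_HB = -4.7
DG_IONIC = -4.7
DG_LIPO = -0.17
DG_ROT = 1.4


def binding_energy(n_hb: float, n_ionic: float, n_lipo: float,
                   n_rot: float, e_vdw: float = 0.0) -> float:
    """Equation 1: weighted sum of interaction counts plus E_VdW.

    Assumption: the rotational term is DG_ROT multiplied by n_rot.
    """
    return (DG_0
            + DG_HB * n_hb
            + DG_IONIC * n_ionic
            + DG_LIPO * n_lipo
            + DG_ROT * n_rot
            + e_vdw)
```

For example, a ligand with no qualifying interactions at all would score 5.4, the entropy penalty alone, and each additional ideal hydrogen bond lowers the estimate by 4.7.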

The term $N_{hb}$ is calculated according to equation 2:

$$N_{hb} = \sum_{h\text{-}bonds} f(\Delta R, \Delta\alpha) \times f(N_{neighb}) \times f_{pcs}$$

$f(\Delta R, \Delta\alpha)$ is a penalty function which accounts for large deviations of hydrogen bonds from ideality and is calculated according to equation 3:

$$f(\Delta R, \Delta\alpha) = f1(\Delta R) \times f2(\Delta\alpha)$$

Where:

$$f1(\Delta R) = \begin{cases} 1 & \text{if } \Delta R \le TOL \\ 1 - (\Delta R - TOL)/0.4 & \text{if } \Delta R \le 0.4 + TOL \\ 0 & \text{if } \Delta R > 0.4 + TOL \end{cases}$$

And:

$$f2(\Delta\alpha) = \begin{cases} 1 & \text{if } \Delta\alpha < 30^\circ \\ 1 - (\Delta\alpha - 30)/50 & \text{if } \Delta\alpha \le 80^\circ \\ 0 & \text{if } \Delta\alpha > 80^\circ \end{cases}$$

$TOL$ is the tolerated deviation in hydrogen bond length = 0.25 Å

$\Delta R$ is the deviation of the H-O/N hydrogen bond length from the ideal value = 1.9 Å

$\Delta\alpha$ is the deviation of the hydrogen bond angle $\angle_{N/O-H \ldots O/N}$ from its idealized value of 180°
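The penalty function of equation 3 is a pair of piecewise-linear ramps, which may be easier to follow as code. The sketch below is illustrative only; taking the absolute difference from the ideal length and angle as the "deviation" is an assumption of the example, as are the function names.

```python
TOL = 0.25            # tolerated deviation in hydrogen bond length, in Å
IDEAL_LENGTH = 1.9    # ideal H...O/N distance, in Å
IDEAL_ANGLE = 180.0   # ideal N/O-H...O/N angle, in degrees


def f1(delta_r: float) -> float:
    """Length penalty: 1 within tolerance, falling linearly to 0 over 0.4 Å."""
    if delta_r <= TOL:
        return 1.0
    if delta_r <= 0.4 + TOL:
        return 1.0 - (delta_r - TOL) / 0.4
    return 0.0


def f2(delta_alpha: float) -> float:
    """Angle penalty: 1 below 30 degrees, falling linearly to 0 at 80 degrees."""
    if delta_alpha < 30.0:
        return 1.0
    if delta_alpha <= 80.0:
        return 1.0 - (delta_alpha - 30.0) / 50.0
    return 0.0


def hbond_penalty(length: float, angle: float) -> float:
    """Equation 3, assuming deviations are absolute differences from ideal."""
    return f1(abs(length - IDEAL_LENGTH)) * f2(abs(angle - IDEAL_ANGLE))
```

A perfectly ideal hydrogen bond (1.9 Å, 180°) therefore contributes a factor of 1, while a bond that is 0.45 Å too long and bent by 55° contributes roughly 0.25.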

$f(N_{neighb})$ distinguishes between concave and convex parts of a protein surface and therefore assigns greater weight to polar interactions found in pockets rather than those found at the protein surface. This function is calculated according to equation 4 below:

$$f(N_{neighb}) = (N_{neighb} / N_{neighb,0})^{\alpha} \quad \text{where } \alpha = 0.5$$

$N_{neighb}$ is the number of non-hydrogen protein atoms that are closer than 5 Å to any given protein atom.

$N_{neighb,0}$ is a constant = 25

$f_{pcs}$ is a function which allows for the polar contact surface area per hydrogen bond and therefore distinguishes between strong and weak hydrogen bonds, and its value is determined according to the following criteria:

$$f_{pcs} = \begin{cases} \beta & \text{when } A_{polar}/N_{HB} < 10\ \text{Å}^2 \\ 1 & \text{when } A_{polar}/N_{HB} > 10\ \text{Å}^2 \end{cases}$$

$A_{polar}$ is the size of the polar protein-ligand contact surface

$N_{HB}$ is the number of hydrogen bonds

$\beta$ is a constant whose value = 1.2
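The two remaining factors of equation 2 can be sketched as follows. The example reads the second branch of the polar-contact-surface criterion as the value 1, and treats a ratio of exactly 10 Å² as falling in that branch; the boundary case is not specified in the text, so that choice is an assumption, as are the function names.

```python
N_NEIGHB_0 = 25   # reference neighbour count
ALPHA = 0.5       # exponent of equation 4
BETA = 1.2        # weight for a small polar contact surface per hydrogen bond


def f_neighb(n_neighb: float) -> float:
    """Equation 4: up-weights polar interactions buried in pockets."""
    return (n_neighb / N_NEIGHB_0) ** ALPHA


def f_pcs(a_polar: float, n_hb: int) -> float:
    """Polar contact surface factor (a_polar in Å^2).

    Assumption: a ratio of exactly 10 Å^2 is given the value 1.
    """
    if n_hb <= 0:
        raise ValueError("n_hb must be a positive number of hydrogen bonds")
    return BETA if a_polar / n_hb < 10.0 else 1.0
```

With these definitions an atom surrounded by 100 neighbours within 5 Å receives twice the weight of one surrounded by 25.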

For the implementation of the modified Böhm scoring function, the contributions from ionic interactions, $\Delta G_{ionic}$, are computed in a similar fashion to those from hydrogen bonds described above since the same geometry dependency is assumed.

The term $N_{lipo}$ is calculated according to equation 5 below:

$$N_{lipo} = \sum_{lL} f(r_{lL})$$

$f(r_{lL})$ is calculated for all lipophilic ligand atoms, $l$, and all lipophilic protein atoms, $L$, according to the following criteria:

$$f(r_{lL}) = \begin{cases} 1 & \text{when } r_{lL} \le R1 \\ (r_{lL} - R1)/(R2 - R1) & \text{when } R1 < r_{lL} < R2 \\ 0 & \text{when } r_{lL} \ge R2 \end{cases}$$

Where:

$$R1 = r_{l}^{vdw} + r_{L}^{vdw} + 0.5 \quad \text{and} \quad R2 = R1 + 3.0$$

$r_{l}^{vdw}$ is the Van der Waals radius of atom $l$

and $r_{L}^{vdw}$ is the Van der Waals radius of atom $L$

The term $N_{rot}$ is the number of rotatable bonds of the amino acid side chain and is taken to be the number of acyclic sp³-sp³ and sp³-sp² bonds. Rotations of terminal -CH₃ or -NH₃ are not taken into account.

The final term, $E_{VdW}$, is calculated according to equation 6 below:

$$E_{VdW} = \varepsilon_{1}\varepsilon_{2}\left(\frac{(r_{1}^{vdw} + r_{2}^{vdw})^{12}}{r^{12}} - \frac{(r_{1}^{vdw} + r_{2}^{vdw})^{6}}{r^{6}}\right)$$

where:

$\varepsilon_{1}$ and $\varepsilon_{2}$ are constants dependent upon atom identity

$r_{1}^{vdw}$ and $r_{2}^{vdw}$ are the Van der Waals atomic radii

$r$ is the distance between a pair of atoms.

With regard to equation 6, in one embodiment, the constants $\varepsilon_{1}$ and $\varepsilon_{2}$ are given the atom values: C: 0.245, N: 0.283, O: 0.316, S: 0.316 (i.e. for atoms of carbon, nitrogen, oxygen and sulphur, respectively). With regard to equations 5 and 6, the Van der Waals radii are given the atom values C: 1.85, N: 1.75, O: 1.60, S: 2.00 Å.
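Equation 6 has the familiar 12-6 form, and a short sketch makes its behaviour around the contact distance concrete. The code below uses the atom constants of the embodiment given above; the dictionary and function names are illustrative assumptions.

```python
# Atom constants of the embodiment described in the text.
EPSILON = {"C": 0.245, "N": 0.283, "O": 0.316, "S": 0.316}
VDW_RADIUS = {"C": 1.85, "N": 1.75, "O": 1.60, "S": 2.00}  # in Å


def e_vdw(atom1: str, atom2: str, r: float) -> float:
    """Equation 6 for one atom pair separated by distance r (in Å)."""
    contact = VDW_RADIUS[atom1] + VDW_RADIUS[atom2]
    ratio6 = (contact / r) ** 6
    # The 12th-power term is repulsive, the 6th-power term attractive.
    return EPSILON[atom1] * EPSILON[atom2] * (ratio6 ** 2 - ratio6)
```

The term is zero when the pair sits exactly at the sum of the two radii, becomes steeply positive (repulsive) at shorter distances, and is weakly negative (attractive) at longer ones, which is the behaviour needed to penalize peptides that clash with the binding groove.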

It should be understood that all predetermined values and constants given in the equations above are determined within the constraints of current understandings of protein ligand interactions with particular regard to the type of computation being undertaken herein. Therefore, it is possible that, as this scoring function is refined further, these values and constants may change. Hence any suitable numerical value which gives the desired results in terms of estimating the binding energy of a protein to a ligand may be used and falls within the scope of the present invention.

As described above, the scoring function is applied to data extracted from the database of side-chain conformations, atom identities, and interatomic distances. For the purposes of the present description, the number of MHC Class II molecules included in this database is 42 models plus four solved structures. It should be apparent from the above descriptions that the modular nature of the construction of the computational method of the present invention means that new models can simply be added and scanned with the peptide backbone library and side-chain conformational search function to create additional data sets which can be processed by the peptide scoring function as described above. This allows for the repertoire of scanned MHC Class II molecules to easily be increased, or structures and associated data to be replaced if data are available to create more accurate models of the existing alleles.

The present prediction method can be calibrated against a data set comprising a large number of peptides whose affinity for various MHC Class II molecules has previously been experimentally determined. By comparison of calculated versus experimental data, a cut-off value can be determined above which it is known that all experimentally determined T-cell epitopes are correctly predicted.
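One simple way to realize such a calibration, sketched below, is to take the lowest calculated score among the peptides experimentally known to be epitopes, so that every known epitope scores at or above the cut-off. This is an assumption about how the cut-off might be chosen, not a statement of the procedure actually used, and the function names are hypothetical.

```python
from typing import List, Sequence


def calibrate_cutoff(scores: Sequence[float],
                     is_epitope: Sequence[bool]) -> float:
    """Lowest calculated score among experimentally confirmed epitopes."""
    epitope_scores = [s for s, known in zip(scores, is_epitope) if known]
    if not epitope_scores:
        raise ValueError("at least one known epitope is required")
    return min(epitope_scores)


def predict(scores: Sequence[float], cutoff: float) -> List[bool]:
    """Flag every peptide scoring at or above the cut-off."""
    return [s >= cutoff for s in scores]
```

A cut-off chosen this way recovers all known epitopes by construction; any non-epitope peptides scoring above it appear as false positives, which indicates how selective the calibrated threshold is.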

It should be understood that, although the above scoring function is relatively simple compared to some sophisticated methodologies that are available, the calculations are performed extremely rapidly. It should also be understood that the objective is not to calculate the true binding energy per se for each peptide docked in the binding groove of a selected MHC Class II protein. The underlying objective is to obtain comparative binding energy data as an aid to predicting the location of T-cell epitopes based on the primary structure (i.e. amino acid sequence) of a selected protein. A relatively high binding energy or a binding energy above a selected threshold value would suggest the presence of a T-cell epitope in the ligand. The ligand may then be subjected to at least one round of amino-acid substitution and the binding energy recalculated. Due to the rapid nature of the calculations, these manipulations of the peptide sequence can be performed interactively within the program's user interface on cost-effectively available computer hardware. Major investment in computer hardware is thus not required. It would be apparent to one skilled in the art that other available software could be used for the same purposes. In particular, more sophisticated software which is capable of docking ligands into protein binding-sites may be used in conjunction with energy minimization. Examples of docking software are: DOCK (Kuntz et al., J. Mol. Biol., 161:269-288 (1982)), LUDI (Böhm, H. J., J. Comput Aided Mol. Des., 8:623-632 (1994)) and FLEXX (Rarey M., et al., ISMB, 3:300-308 (1995)). Examples of molecular modeling and manipulation software include: AMBER (Tripos) and CHARMm (Molecular Simulations Inc.). The use of these computational methods would severely limit the throughput of the method of this invention due to the lengths of processing time required to make the necessary calculations. However, it is feasible that such methods could be used as a ‘secondary screen’ to obtain more accurate calculations of binding energy for peptides which are found to be ‘positive binders’ via the method of the present invention.

The limitation of processing time for sophisticated molecular mechanics or molecular dynamics calculations is one which is defined both by the design of the software which makes these calculations and the current technology limitations of computer hardware. It may be anticipated that, in the future, with the writing of more efficient code and the continuing increases in speed of computer processors, it may become feasible to make such calculations within a more manageable time-frame.

Further information on energy functions applied to macromolecules and consideration of the various interactions that take place within a folded protein structure can be found in: Brooks, B. R., et al., J. Comput. Chem., 4:187-217 (1983) and further information concerning general protein-ligand interactions can be found in: Dauber-Osguthorpe et al., Proteins 4(1):31-47 (1988), which are incorporated herein by reference in their entirety. Useful background information can also be found, for example, in Fasman, G. D., ed., Prediction of Protein Structure and the Principles of Protein Conformation, Plenum Press, New York, ISBN: 0-306 4313-9.

The following examples describe the invention in more detail.

EXAMPLE 2

The 165 amino acid sequence of INFα2 was analyzed in silico broadly by the method of EXAMPLE 1. A panel of 57 13-mer synthetic peptides was produced and analyzed for their ability to bind in vitro with human MHC class II molecules. The peptide sequences are depicted in FIG. 5.

MHC class II synthetic peptide binding assays were conducted using human lymphoblastoid B cells of known HLA-DR allotype. Cells were fixed with paraformaldehyde and incubated with either biotinylated peptides alone or with a non-biotinylated competitor peptide to determine IC₅₀ values. Following incubation with the peptides, cells were lysed and the MHC Class II molecules captured by the anti-HLA-DR α-chain monoclonal antibody LB3.1. Bound biotinylated peptide was detected by streptavidin peroxidase, and the amount of bound peptide quantitated by a luminescent read out.

Competition-binding assays were conducted, where non-biotinylated test peptides were incubated with the fixed cells in the presence of the biotinylated competitor peptide. Competitor peptides were previously determined to have IC₅₀ values for the particular allotypes of interest using a simple (non-competitive) binding assay. The IC₅₀ value is the concentration of the unlabelled peptide that prevents 50% of the labelled peptide from binding. The concentration of the biotinylated peptide was determined experimentally to be at least one sixth of its measured ED₅₀ (concentration of peptide that gives one half of the maximum response) for each allele, to ensure that the inhibition was primarily measuring the binding characteristics of the competitor peptide.

EBV transformed human B lymphoblastoid cell lines are obtainable from ECACC (Salisbury, UK). HOM-2 cells were used in assays for DRB1*0101 binding; WT51 cells were used in assays of DRB1*0401 binding and MOU (MANN) cells were used for assays of DRB1*0701 binding. The mouse hybridoma LB3.1 was obtained from the American Type Culture Collection, ATCC (Virginia, USA). Enhanced Chemiluminescent (ECL) reagent was purchased from Amersham Pharmacia (Amersham, UK). RPMI 1640 medium, L-glutamine, and penicillin/streptomycin were obtained from Life Technologies (Paisley, UK). Optiplates™ were obtained from Packard (Pangbourne, England). Biotinylated peptides were obtained from Babraham Technix (Cambridge, England) and non-biotinylated peptides from Pepscan Systems (Lelystad, The Netherlands). Prosep A was obtained from Millipore (Watford, UK). DAB, PMSF, iodoacetamide, benzamidine, leupeptin, pepstatin, PBS tablets, DMSO, BSA, streptavidin peroxidase conjugate and all other chemicals were obtained from The Sigma Chemical Company (Poole, UK).

Lymphoblastoid cells were cultivated in RPMI-1640 medium plus 10% foetal bovine serum (FBS), L-glutamine, and penicillin/streptomycin in a humidified atmosphere at 37° C./5% CO₂.

LB3.1 hybridoma cells were cultivated in RPMI-1640 medium plus 10% foetal bovine serum (FBS), L-glutamine, and penicillin/streptomycin in a humidified atmosphere at 37° C./5% CO₂ and LB3.1 antibody was purified from the culture supernatant. The supernatant was filtered using 0.22 μm filters, then 50 ml of 1M Tris pH 8.0 was added per 450 ml, making a 0.1M-buffered solution. The buffered supernatant was then passed through a 3 ml PROSEP A column overnight at 4° C. and washed with 25 ml of PBS. LB3.1 antibody was eluted using 8 ml of 0.1M citrate pH 3.0, and each 0.5 ml fraction was collected into 500 μl 1M Tris-HCl pH 8.0. The protein content of each fraction was determined using a spectrophotometer (A280 nm). Fractions were pooled and dialysed into 800 ml PBS, using a Slide-A-Lyzer 3.5K cut-off (Pierce). Purity of the LB3.1 was checked by reduced SDS-PAGE followed by Coomassie staining.

Binding assays for each peptide/allotype combination were conducted in triplicate in 96-well flat bottom Optiplates™ using 2×10⁶ cells per well. Cells were washed twice with RPMI-1640 then fixed with 0.5% paraformaldehyde/PBS for 30 min on ice. After 2 washes with RPMI-1640 the cells were incubated with either: biotinylated peptide; biotinylated peptide + non-biotinylated competitive peptide; or no peptide. Incubation was conducted using Peptide Binding Buffer (100 mM Citrate/Phosphate pH 4.5, 5 mM EDTA, 1 mM PMSF, 100 μM Leupeptin, 1 mM Iodoacetamide, 100 μM Pepstatin A, 1 mM Benzamidine) at 37° C. for 24 h. Cells were collected by centrifugation, then 80 μl supernatant was removed and replaced with 80 μl/well of NP40 lysis buffer (0.5% NP40, 150 mM NaCl, 1 mM PMSF, 100 μM Leupeptin, 1 mM Iodoacetamide, 100 μM Pepstatin A, 1 mM Benzamidine, 50 mM Tris-HCl pH 8.0). Cells were incubated at 4° C. for 45 minutes and a cleared lysate obtained by centrifugation. 50 μl (triplicate/sample) of lysate was added to each well of a pre-coated 96 well assay plate containing 50 μl PBS/5% BSA. In all experiments pre-coating had been conducted the night before, where assay plates had been pre-treated at 4° C. with 100 μl/well anti-class II antibody LB3.1 diluted to 20 μg/ml in PBS. Excess antibody was removed and the plate blocked with 250 μl/well of PBS/5% BSA for 2 hours at room temperature. Each plate was washed 7× with PBS/1% Tween and 50 μl of PBS/5% BSA/0.5% NP40 added to each well before addition of the cell lysates.

Following addition of the lysates, plates were incubated at 4° C. for 2 hours, then washed 7× with PBS/0.1% Tween. 100 μl of Streptavidin Peroxidase diluted 1:1000 in PBS/5% BSA/0.1% Tween was added to each well and the plates incubated at 4° C. for 1 hour. Plates were washed 7× with PBS/0.1% Tween and 100 μl of chemiluminescent substrate (Amersham Pharmacia) added to each well. Plates were read using a Perkin Elmer MicroBeta® TriLux plate reader and results were given in CPS.

In the competition analyses, maximum binding of biotinylated peptide was defined as the binding (CPS) occurring in the absence of the competitor peptide, and inhibition was calculated by the formula:

$$\%\ \text{Inhibition} = 100 \times \frac{(\text{CPS with no competitor}) - (\text{CPS with competitor})}{\text{CPS with no competitor}}$$

The concentration of competitor peptide causing 50% inhibition of maximum biotinylated peptide binding was taken as the IC₅₀.
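The inhibition calculation and the reading-off of an IC₅₀ can be sketched as below. The percent-inhibition function follows the formula given above; the IC₅₀ estimate uses linear interpolation on the logarithm of concentration between the two bracketing measurements, which is an assumption made for this example since the curve-fitting procedure is not specified here. Function names are illustrative.

```python
import math
from typing import Optional, Sequence


def percent_inhibition(cps_no_competitor: float,
                       cps_with_competitor: float) -> float:
    """Percent inhibition relative to binding without competitor."""
    return (100.0 * (cps_no_competitor - cps_with_competitor)
            / cps_no_competitor)


def estimate_ic50(concentrations: Sequence[float],
                  inhibitions: Sequence[float]) -> Optional[float]:
    """Concentration giving 50% inhibition, or None if never bracketed.

    Assumption: log-linear interpolation between neighbouring points.
    """
    points = sorted(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            fraction = (50.0 - i_lo) / (i_hi - i_lo)
            log_c = math.log10(c_lo) + fraction * (
                math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None
```

For instance, a well reading 250 CPS with competitor against 1000 CPS without corresponds to 75% inhibition.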

The binding assays conducted on the panel of 13-mer peptides as listed in FIG. 5 are depicted in FIG. 6 a-d. With the exception of peptides shown in FIG. 6 d, all peptides indicate a binding interaction with one or more of the human MHC class II allotypes tested.

EXAMPLE 3

The interaction between MHC, peptide and T-cell receptor (TCR) provides the structural basis for the antigen specificity of T-cell recognition. T-cell proliferation assays test the binding of peptides to MHC and the recognition of MHC/peptide complexes by the TCR. In vitro T-cell proliferation assays of the present example involve the stimulation of peripheral blood mononuclear cells (PBMCs), containing antigen presenting cells (APCs) and T-cells. Stimulation is conducted in vitro using synthetic peptide antigens, and in some experiments whole protein antigen. Stimulated T-cell proliferation is measured using ³H-thymidine (³H-Thy) and the presence of incorporated ³H-Thy assessed using scintillation counting of washed fixed cells.

Buffy coats from human blood stored for less than 12 hours were obtained from the National Blood Service (Addenbrooke's Hospital, Cambridge, UK). Ficoll-paque was obtained from Amersham Pharmacia Biotech (Amersham, UK). Serum free AIM V media for the culture of primary human lymphocytes and containing L-glutamine, 50 μg/ml streptomycin, 10 μg/ml gentamycin and 0.1% human serum albumin was from Gibco-BRL (Paisley, UK). Synthetic peptides were obtained from Eurosequence (Groningen, The Netherlands) and Babraham Technix (Cambridge, UK).

Erythrocytes and leukocytes were separated from plasma and platelets by gentle centrifugation of buffy coats. The top phase (containing plasma and platelets) was removed and discarded. Erythrocytes and leukocytes were diluted 1:1 in phosphate buffered saline (PBS) before layering onto 15 ml ficoll-paque (Amersham Pharmacia, Amersham UK). Centrifugation was done according to the manufacturer's recommended conditions. PBMCs were harvested from the serum+PBS/ficoll paque interface. PBMCs were mixed with PBS (1:1) and collected by centrifugation. The supernatant was removed and discarded and the PBMC pellet resuspended in 50 ml PBS. Cells were again pelleted by centrifugation and the PBS supernatant discarded. Cells were resuspended using 50 ml AIM V media and at this point counted and viability assessed using trypan blue dye exclusion. Cells were again collected by centrifugation and the supernatant discarded. Cells were resuspended for cryogenic storage at a density of 3×10⁷ per ml. The storage medium was 90% (v/v) heat inactivated AB human serum (Sigma, Poole, UK) and 10% (v/v) DMSO (Sigma, Poole, UK). Cells were transferred to a regulated freezing container (Sigma) and placed at −70° C. overnight. When required for use, cells were thawed rapidly in a water bath at 37° C. before transferring to 10 ml pre-warmed AIM V medium.

PBMC were stimulated with protein and peptide antigens in a 96 well flat bottom plate at a density of 2×10⁵ PBMC per well. PBMC were incubated for 7 days at 37° C. before pulsing with ³H-Thy (Amersham-Pharmacia, Amersham, UK). For the present study, synthetic peptides (15-mers) that overlapped by 3 aa increments were generated that spanned the entire sequence of IFNα. Peptide identification numbers (ID#) and sequences are given in FIG. 7. Each peptide was screened individually against PBMCs isolated from 20 naïve donors. Two control peptides that have previously been shown to be immunogenic and a potent non-recall antigen, KLH, were used in each donor assay.
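A peptide panel of this kind can be generated with a simple sliding window. The sketch below assumes that "3 aa increments" means each successive 15-mer starts three residues after the previous one, and it appends a final window if the tail of the sequence would otherwise be left uncovered; both points are assumptions of the example, and the actual panel is the one given in FIG. 7.

```python
from typing import List


def overlapping_peptides(sequence: str, length: int = 15,
                         step: int = 3) -> List[str]:
    """Fixed-length peptides whose start positions advance by `step`."""
    if len(sequence) < length:
        raise ValueError("sequence is shorter than the peptide length")
    last_start = len(sequence) - length
    peptides = [sequence[i:i + length]
                for i in range(0, last_start + 1, step)]
    # Assumption: add one extra window so the C-terminus is always covered.
    if last_start % step != 0:
        peptides.append(sequence[-length:])
    return peptides
```

Under these assumptions a 165-residue sequence yields 51 peptides with start positions 0, 3, ..., 150.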

The control antigens used in this study were as below:

Peptide   Sequence               Source
C-32      Biotin-PKYVKQNTLKLAT   Flu haemagglutinin 307-319
C-49      KVVDQIKKISKPVQH        Chlamydia HSP 60 peptide
KLH       Whole protein from Keyhole Limpet Hemocyanin

Peptides were dissolved in DMSO to a final concentration of 10 mM; these stock solutions were then diluted 1/500 in AIM V media (final concentration 20 μM). Peptides were added to a flat bottom 96 well plate to give a final concentration of 2 and 20 μM in 100 μl. The viability of thawed PBMCs was assessed by trypan blue dye exclusion; cells were then resuspended at a density of 2×10⁶ cells/ml, and 100 μl (2×10⁵ PBMC/well) was transferred to each well containing peptides. Triplicate well cultures were assayed at each peptide concentration. Plates were incubated for 7 days in a humidified atmosphere of 5% CO₂ at 37° C. Cells were pulsed for 18-21 hours with 1 μCi ³H-Thy/well before harvesting onto filter mats. CPM values were determined using a Wallac microplate beta top plate counter (Perkin Elmer). Results were expressed as stimulation indices, determined using the following formula:

$$\text{Stimulation index} = \frac{\text{Proliferation to test peptide (CPM)}}{\text{Proliferation in untreated wells (CPM)}}$$
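The arithmetic in this protocol is easy to check in a few lines. The sketch below confirms the 1/500 dilution of the 10 mM stock and computes a stimulation index from replicate wells; averaging the triplicate CPM values before taking the ratio is an assumption of the example, as are the function names.

```python
from statistics import mean
from typing import Sequence


def diluted_concentration_um(stock_mm: float, dilution_factor: float) -> float:
    """Concentration in μM after diluting a stock given in mM."""
    return stock_mm * 1000.0 / dilution_factor


def stimulation_index(test_cpm: Sequence[float],
                      untreated_cpm: Sequence[float]) -> float:
    """Ratio of mean CPM in peptide wells to mean CPM in untreated wells.

    Assumption: replicate wells are averaged before the ratio is taken.
    """
    return mean(test_cpm) / mean(untreated_cpm)
```

For example, triplicate readings of 400, 420 and 380 CPM against untreated wells of 200, 210 and 190 CPM give a stimulation index of 2.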

A stimulation index of 2 or greater was taken as positive stimulation in this assay. Mapping T cell epitopes in the IFNα sequence using the T cell proliferation assay resulted in the identification of three immunogenic regions R1, R2 and R3. This was determined by T cell proliferation in 6 donors that responded to peptides in one or more of these regions. Region 3 is considered to contain a potential immunodominant T-cell epitope as proliferation is scored in 4 of 6 donors that responded to IFNα peptides. Regions 1 and 2 induce T-cell proliferation in certain individuals. The cumulative response data for the responding individuals are depicted in FIG. 8, and data from individual responders are summarized in FIG. 9. The stimulation index for individual donors is shown in FIG. 15. The epitope data for INFα, indicating R1, R2 and R3 together with the individual peptide/donor responses, are depicted in FIG. 10.

EXAMPLE 4

The tissue types for all PBMC samples used in EXAMPLE 3 were assayed using a commercially available reagent system (Dynal, Wirral, UK). Assays were conducted in accordance with the supplier's recommended protocols and standard ancillary reagents and agarose electrophoresis systems. Allotypic coverage for DRB1 alleles was 70% in the 20 donors tested. Results of the tissue typing were used to assess the frequency of INFα2 peptide responders carrying specific MHC class II alleles. Allotypic restriction of a given peptide is determined by the frequency of an allele in the donor population and the number of responding donors that express the same allele. If a peptide is associated with any particular allele then the frequency among responders (expressed as a percentage) is expected to be greater than the frequency of the allele in the population. Results of such an analysis are given in FIG. 11. In general the small numbers preclude rigorous statistical examination; however, the data indicate a possible association of peptides from the epitope region defined as R3 with the DRB4*01 allotype.
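The comparison described above amounts to computing, for each allele, its frequency among all donors and its frequency among the donors responding to a given peptide. A minimal sketch follows; the data layout (a dictionary mapping each donor to the set of alleles carried) and the function name are assumptions of the example, and the donor data in the usage note are invented purely for illustration.

```python
from typing import Dict, Set, Tuple


def allele_association(donor_alleles: Dict[str, Set[str]],
                       responders: Set[str]) -> Dict[str, Tuple[float, float]]:
    """Map allele -> (% of all donors, % of responders) carrying it."""
    n_donors = len(donor_alleles)
    n_responders = len(responders)
    alleles = set().union(*donor_alleles.values())
    result = {}
    for allele in sorted(alleles):
        carriers = {d for d, a in donor_alleles.items() if allele in a}
        population_pct = 100.0 * len(carriers) / n_donors
        responder_pct = (100.0 * len(carriers & responders) / n_responders
                         if n_responders else 0.0)
        result[allele] = (population_pct, responder_pct)
    return result
```

With four hypothetical donors of whom two carry a given allele and both of those respond, the allele would show 50% in the population against 100% among responders, the pattern that would suggest an association.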

EXAMPLE 5

MHC peptide binding assays were conducted using synthetic peptides containing sequences derived from the major immunogenic regions identified using the biological assay of EXAMPLE 3. In these assays synthetic 15-mer peptides were tested for their ability to bind three MHC allotypes in competition with a biotinylated competitor peptide. Assays were conducted broadly as detailed in EXAMPLE 2 and IC₅₀ values calculated from binding curves derived from six concentration ratios of competitor to test peptide. The IC₅₀ values for each peptide/allotype combination tested are shown in FIG. 12 a-12 c. These data indicate that peptides capable of stimulating T-cell proliferation in an in vitro biological assay may be either low or high affinity MHC class II ligands.

EXAMPLE 6

A number of modified IFNα2 molecules were made using conventional recombinant DNA techniques. A wild-type INFα2b gene was cloned from human placental DNA and the gene was used both as a control reagent and as a template from which to derive modified INFα2b genes by site-directed mutagenesis. Wild-type and modified genes were inserted into a eukaryotic expression vector and the recombinant INFα2 proteins expressed as fusion proteins with the human immunoglobulin constant region domain. Recombinant proteins were prepared from transiently transfected human embryonic kidney cells and assayed as detailed in EXAMPLE 7.

Briefly, the wild-type INFα2b gene was amplified from human placental DNA (Sigma, Poole, UK) using the polymerase chain reaction (PCR). The gene contains no introns and was readily amplified using forward and reverse primers OL177 and OL178 containing restriction sites to facilitate cloning as given below:

OL177 (EcoRI): 5′-CCGGAATTCGCTAGCTGCCCAGCCGGCGATGGCCTGTGATCTGCCTCAAACCCACAGCC-3′
OL178 (XhoI/BamHI): 5′-CCGGGATCCCTCGAGCTATTATTCCTTACTTCTTAAACTTTCTTGCAAG-3′

The PCR product of 550 bp was digested with EcoRI and BamHI and cloned into the pLITMUS28 vector (NEB, LUK Ltd.). The sequence was confirmed to be that of interferon alpha 2b by analysis of a number of positive clones. In order to obtain expression from human embryonic kidney cells, the wild-type gene was re-cloned into vector pd-Cs [Lo, et al (1998), Protein Engineering 11: 495]. The pd-Cs vector directs the expression of a fusion protein containing the human immunoglobulin constant region domain. Cloning to this vector was achieved using PCR and primers OL232 and OL178. These primers provide cloning sites for use with enzymes XmaI and BamHI as below:

OL232 (XmaI): 5′-CTGTCCCCGGGTAAATGTGATCTGCCTCAGACCCACAGCC-3′
OL178 (XhoI/BamHI): 5′-CCGGGATCCCTCGAGCTATTATTCCTTACTTCTTAAACTTTCTTGCAAG-3′

The PCR product of 530 bp was digested with XmaI and BamHI, purified using a Qiagen gel extraction kit and transferred into prepared pd-Cs from which the IFN(L) sequence had been removed using XmaI and BamHI. A positive clone was selected and the INFα2b sequence confirmed by sequence analysis. The pd-Cs vector containing the wild-type INFα2b gene was termed pCIFN5.

Single or multiple codon mutations to generate modified INFα2 genes were introduced by mutagenic PCR using pCIFN5 as a template. Overlap PCR was used to combine the two mutated halves of the interferon sequence. This fragment was then cloned into an intermediate vector (pGEM-T EASY vector; Promega, UK) for sequence analysis prior to being transferred into the pd-Cs derived expression vector using XmaI and BamHI as described above.

Mutagenesis was conducted using flanking primers OL235 and OL234 in separate reactions in combination with specific mutagenic (mis-matched) primers and the pCIFN5 template DNA.

OL234: 5′-CTCATGCTCCGTGATGCATGAGGC
OL235: 5′-CACTGCATTCTAGTTGTGGTTTGTC

Reactions were conducted using Expand HI Fidelity PCR reagents (Roche, GmbH) and reaction conditions specified by the following cycle: 94° C./2′ + 25 cycles @ 94° C./30″, 60° C./30″, 72° C./30″ + 72° C./10″

The products of the separate reactions were joined by PCR in a reaction driven by primers OL235 and OL234 using 15 cycles of PCR as above.

PCR products were gel purified using commercially available kit systems (Qiagen gel extraction kit). The products were cloned using a T/A cloning system into vector pGEM-T EASY (Promega, UK) and a number of clones were sequenced in each case to confirm the successful introduction of the desired mutation.

The desired clones were digested with BamHI and XmaI and the purified product ligated into a prepared pd-Cs vector. Cloning was conducted using E. coli XL1-Blue cells (Stratagene Europe) and culture conditions recommended by the supplier. Sequence confirmation was conducted on all final vector preparations using OL261 and OL234 as sequencing primers.

OL261: 5′-GGTGACAGAGACTCCCCTGATGAAG-3′

OL234: 5′-CTCATGCTCCGTGATGCATGAGGC-3′

Expression of modified INFα2-human Ig Fc fusion proteins was achieved using the HEK293 human embryonic kidney cell line as the expression host. All DNA for transfection was prepared using the high purity CONCERT midiprep system and instructions provided by the supplier (Invitrogen, Paisley, UK). DNA was filter sterilised prior to use and quantified by measurement of the A₂₆₀. Concentrations were adjusted to 0.5-1.0 μg/μl.
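The A₂₆₀ quantification can be illustrated with a short calculation. The sketch below assumes the commonly used conversion of about 50 μg/ml double-stranded DNA per A₂₆₀ unit at a 1 cm path length; this factor, the dilution factor argument and the function names are assumptions for illustration and are not stated in the text.

```python
# Assumed conversion factor: A260 of 1.0 ~ 50 ug/ml double-stranded DNA (1 cm path).
DSDNA_UG_PER_ML_PER_A260 = 50.0

# Working range stated in the text, in ug/ul.
TARGET_RANGE_UG_PER_UL = (0.5, 1.0)


def dsdna_conc_ug_per_ul(a260, dilution_factor=1.0):
    """Estimate the stock DNA concentration (ug/ul) from an A260 reading of a diluted sample."""
    ug_per_ml = a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor
    return ug_per_ml / 1000.0


def in_target_range(conc_ug_per_ul):
    """True if the concentration lies within the 0.5-1.0 ug/ul working range."""
    low, high = TARGET_RANGE_UG_PER_UL
    return low <= conc_ug_per_ul <= high


def volume_for_mass_ul(mass_ug, conc_ug_per_ul):
    """Volume of stock (ul) that contains the requested mass of DNA."""
    return mass_ug / conc_ug_per_ul


if __name__ == "__main__":
    conc = dsdna_conc_ug_per_ul(a260=0.2, dilution_factor=100)  # hypothetical reading
    print(conc, in_target_range(conc), volume_for_mass_ul(0.8, conc))
```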

For transient expression, HEK293 cells were grown using D-MEM glutamax medium (Invitrogen, Paisley, UK) supplemented with 10% FCS and 250 μg/ml geneticin. Prior to transfection, cells were collected by treatment with trypsin and washed using PBS. After 2 cycles of washing, cells are taken into fresh medium at a density of 4×10⁵ cells/ml, and plated into multiwell dishes pre-treated with poly-L-lysine to ensure good cell adhesion. Typically, 2×10⁵ cells are added to each well of a 48-well plate and the plates incubated overnight at 37° C./5% CO₂.
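The plating figures above imply a seeding volume per well, which the following sketch works out. The function names and the 10% pipetting overage are illustrative assumptions; only the cell density, cells per well and plate size come from the text.

```python
def seeding_volume_ml(cells_per_well, density_cells_per_ml):
    """Volume of cell suspension (ml) that delivers the requested number of cells."""
    return cells_per_well / density_cells_per_ml


def suspension_needed_ml(n_wells, cells_per_well, density_cells_per_ml, overage=0.10):
    """Total suspension volume for a plate, with an assumed fractional overage."""
    per_well = seeding_volume_ml(cells_per_well, density_cells_per_ml)
    return n_wells * per_well * (1.0 + overage)


if __name__ == "__main__":
    per_well = seeding_volume_ml(2e5, 4e5)          # values from the text
    total = suspension_needed_ml(48, 2e5, 4e5)      # one 48-well plate
    print(f"{per_well:.2f} ml per well, {total:.1f} ml per plate")
```

With the stated values, each well receives 0.5 ml of suspension.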

Prior to transfection, the medium is replaced in each well and the transfection mixes added. Transfection is conducted using the lipofectamine reagent and instructions provided by the supplier (Invitrogen, Paisley, UK). Briefly, transfection mixes are prepared containing lipofectamine, OPTI-MEM (Invitrogen, Paisley, UK) and 0.8 μg DNA per well for each expression vector construct. Transfection mixes are added to the cells and the cells incubated for 4-6 hours. The medium is replaced with 0.5 ml fresh medium and the cells incubated at 37° C./5% CO₂. Samples were taken after 48 hours for analysis by both anti-Fc ELISA and Daudi cell proliferation assay. The medium was harvested after 7 days and stored at 4° C. for further analysis as above.

The medium is assayed for the presence of INFα2 using a commercially available ELISA system and instructions provided by the supplier (R&D Systems, UK). In some instances an ELISA detecting the human immunoglobulin constant region domain of the IFNα-fusion protein was applied. For this assay a mouse anti-human IgG Fc preparation (Sigma, Poole, UK) is used as a capture reagent. The INFα-HuFc fusion is quantitated with reference to a standard curve generated using a dilution series of a reference human IgG preparation (Sigma). Bound INFα-Fc fusion or the reference protein is detected using an anti-human IgG peroxidase conjugate (Sigma) and Sigma OPD colourimetric substrate.
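Quantitation against a standard curve can be sketched as follows. The text does not state how the curve is fitted, so this example assumes simple log-linear interpolation between the two bracketing standards; the function name and the standard values in the usage lines are hypothetical.

```python
import math


def interpolate_concentration(absorbance, standards):
    """Read a concentration off a standard curve.

    standards: list of (concentration, absorbance) pairs with positive
    concentrations and absorbance rising with concentration.
    Returns None when the reading falls outside the range of the standards.
    """
    points = sorted(standards, key=lambda p: p[1])   # order by absorbance
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        if a_lo <= absorbance <= a_hi:
            if a_hi == a_lo:
                return c_lo
            frac = (absorbance - a_lo) / (a_hi - a_lo)
            # Interpolate on log10(concentration), as dilution series are geometric.
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


if __name__ == "__main__":
    # Hypothetical standards: (ng/ml reference IgG, absorbance).
    curve = [(1.0, 0.1), (10.0, 0.5), (100.0, 0.9)]
    print(interpolate_concentration(0.3, curve))
    print(interpolate_concentration(2.0, curve))   # outside the curve -> None
```

Readings outside the standard range return `None` rather than an extrapolated value, so such samples would need to be re-assayed at a different dilution.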

Following estimation of the amount of INFα in the HEK293 conditioned medium, the conditioned medium is used directly to test the functional activity of the modified INFα using the anti-proliferation assay as detailed in EXAMPLE 7.

EXAMPLE 7

Modified interferon molecules of the present invention were tested for their ability to inhibit the growth of human B cell lymphoma line Daudi. The method is broadly as described previously [Mark, D. F. et al (1984) Proc. Natl. Acad. Sci. USA 81: 5662-5666] and involves incubation of Daudi cells with the test interferon. The anti-proliferative effect of the test molecule is measured using a soluble dye substance which undergoes a colour change in the presence of proliferating cells. The induced colour change is measured in a spectrophotometer and any antiproliferative effect is computed with reference to the colour change recorded in non-treated control cells and cells treated with a standard interferon preparation.

Briefly, Daudi cells (ATCC # CCL-213) were cultured in RPMI 1640 medium supplemented with 100 units/ml Penicillin/100 μg/ml Streptomycin, 2 mM L-Glutamine and 20% Fetal Bovine Serum (FBS). All media and supplements were from Gibco (Paisley, UK). The day before assay, cells are replaced into fresh medium at a density of 0.9×10⁶/ml and next day replaced into fresh medium as above except containing 10% (v/v) FBS. The cell density is adjusted to be 2×10⁵ cells/ml.

The test and control interferon preparations are diluted into RPMI containing 10% FBS. Dilutions are made into 96-well flat bottom plates to contain 100 μl/well and all samples are set up in triplicate. Typically doubling dilution series are set out across each plate. Positive control wells are also included in triplicate with a starting concentration of the interferon standard (NIBSC, South Mimms, UK) at 10000 pg/ml. Control wells containing 100 μl media alone (no interferon) are also included. 100 μl of the cells are added to each well, and the plates incubated for 72 hours at 37° C., 5% CO₂.
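The doubling dilution series for the interferon standard can be generated as below. The number of dilution points (12, one per column of a 96-well plate) is an assumption, as is the reading that the stated 10000 pg/ml refers to the 100 μl sample before the 100 μl of cells are added; the function names are hypothetical.

```python
def doubling_dilution_series(start_pg_per_ml, n_points):
    """Concentrations of a two-fold serial dilution, highest first."""
    return [start_pg_per_ml / (2 ** i) for i in range(n_points)]


def in_well_concentration(sample_conc, sample_ul=100.0, cells_ul=100.0):
    """Concentration after the cell suspension is added to the diluted sample."""
    return sample_conc * sample_ul / (sample_ul + cells_ul)


if __name__ == "__main__":
    series = doubling_dilution_series(10000, 12)   # 12 points is an assumption
    for conc in series:
        print(f"{conc:10.2f} pg/ml in sample, {in_well_concentration(conc):10.2f} pg/ml in well")
```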

Proliferation is assessed using Aqueous One reagent system and the supplier's recommended protocol (Promega, Southampton, UK). Briefly, 40 μl of the Aqueous One reagent is added to all wells and the substrate mixed. Plates are incubated at 37° C. for one hour, and then transferred to the plate reading instrument for determination of the light absorbance. Readings are taken at 490 nm. Average absorbance at 490 nm is plotted on the Y axis versus concentrations of interferon standard added along the X axis. Interferon concentration is determined using an ELISA technique as detailed in EXAMPLE 6. The resulting calibration curve is used to determine the ED₅₀ value for each test sample.
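One way to read an ED₅₀ from such absorbance data is sketched below. The text does not define the calculation, so this example assumes that ED₅₀ is the concentration at which absorbance lies midway between the untreated control and the maximally inhibited level, found by log-linear interpolation between neighbouring dilution points. The function name and all numeric values are hypothetical.

```python
import math


def ed50(concentrations, absorbances, a_untreated, a_max_inhibited):
    """Estimate the concentration giving a half-maximal anti-proliferative effect.

    concentrations: positive interferon concentrations of the dilution series.
    absorbances: mean A490 for each concentration (triplicates already averaged).
    Returns None if the half-maximal absorbance is never crossed.
    """
    half = (a_untreated + a_max_inhibited) / 2.0
    points = sorted(zip(concentrations, absorbances))   # order by concentration
    for (c1, a1), (c2, a2) in zip(points, points[1:]):
        crosses = (a1 - half) * (a2 - half) <= 0
        if crosses and a1 != a2:
            frac = (half - a1) / (a2 - a1)
            log_c = math.log10(c1) + frac * (math.log10(c2) - math.log10(c1))
            return 10 ** log_c
    return None


if __name__ == "__main__":
    conc = [1.0, 10.0, 100.0, 1000.0]     # pg/ml, hypothetical
    a490 = [1.0, 0.8, 0.4, 0.2]           # hypothetical mean absorbances
    print(ed50(conc, a490, a_untreated=1.0, a_max_inhibited=0.2))
```

A fitted sigmoidal curve would be a reasonable alternative to point-to-point interpolation when the data are noisy.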

Results of such an analysis according to the above method for a number of modified INFα2 molecules are depicted in FIG. 14. The results indicate retained anti-proliferative properties in the presence of amino acid substitutions within the INFα2 sequence.

1. A modified molecule having the biological activity of human interferon alpha 2 (INFα2) and being substantially non-immunogenic or less immunogenic than any non-modified molecule having the same biological activity when used in vivo.
 2. A molecule according to claim 1, wherein said loss of immunogenicity is achieved by removing one or more T-cell epitopes derived from the originally non-modified molecule and/or by reduction in numbers of MHC allotypes able to bind peptides derived from said molecule.
 3. A molecule according to claim 2, wherein one T-cell epitope is removed.
 4-70. (canceled)
 71. A protein that is homologous to human interferon alpha 2 (INFα2), the human interferon alpha 2 having an amino acid sequence (SEQ ID NO: 1) that includes at least one T-cell epitope; the protein having substantially the same amino acid sequence as SEQ ID NO: 1, but including at least one less T-cell epitope; wherein the protein has substantially the same biological activity as human interferon alpha 2, but is less immunogenic than said human interferon alpha 2 when both are exposed to the immune system of the same species.
 72. The protein of claim 71 wherein the amino acid sequence of the protein includes one less T-cell epitope.
 73. The protein of claim 71 wherein the amino acid sequence of the protein differs from SEQ ID NO: 1 by one to nine amino acid residues.
 74. The protein of claim 71 wherein the amino acid sequence of the protein has at least one less amino acid residue than SEQ ID NO: 1.
 75. The protein of claim 71 wherein the amino acid sequence of the protein has at least one more amino acid residue than SEQ ID NO: 1.
 76. The protein of claim 71 wherein the amino acid sequence of the protein has the same number of amino acid residues as SEQ ID NO: 1.
 77. A pharmaceutical composition comprising a protein of claim 71 and a pharmaceutically acceptable carrier therefor.
 78. A method of preparing a protein of claim 71, the method comprising the steps of: (i) identifying one or more potential T-cell epitopes within the amino acid sequence of human interferon alpha 2 (SEQ ID NO: 1); (ii) selecting at least one sequence variant of at least one potential T-cell epitope identified in step (i) that eliminates or substantially reduces the MHC class II binding activity of the potential T-cell epitope; wherein the amino acid sequence of the selected variant differs from the amino acid sequence of the T-cell epitope identified in step (i) by at least one amino acid residue; (iii) preparing, by recombinant DNA techniques, at least one protein that includes at least one variant selected in step (ii); (iv) evaluating the biological activity and immunogenicity of at least one protein prepared in step (iii); and (v) selecting a protein evaluated in step (iv) that has substantially the same biological activity as, but substantially less immunogenicity than human interferon alpha 2.
 79. The method of claim 78 wherein step (i) is carried out by determining the MHC class II binding affinity of potential T-cell epitope segments of human interferon alpha 2 using an in vitro assay, an in silico technique, or a biological assay.
 80. The method of claim 78 wherein step (i) is carried out by: (a) selecting a region of the amino acid sequence of human interferon alpha 2 (SEQ ID NO: 1); (b) sequentially sampling overlapping amino acid residue segments of predetermined uniform size and including at least three amino acid residues from the selected region; (c) calculating the MHC class II molecule binding score for each of the sampled segments by summing assigned values for each hydrophobic amino acid residue side chain present in the sampled amino acid residue segment; and (d) identifying at least one segment that is suitable for modification based on the calculated MHC class II binding score for that segment to reduce the overall MHC class II binding score for the protein relative to the binding score for human interferon alpha 2.
 81. The method of claim 80 wherein step (c) is carried out by using a Böhm scoring function modified to include a van der Waals ligand-protein energy repulsive term and a ligand conformational energy term by: (1) selecting a model from a first database of MHC class II molecule models; (2) selecting an allowed peptide backbone from a second database of allowed peptide backbones for the MHC class II molecule models in step (1); (3) identifying amino acid residue side chains present in each sampled segment; (4) determining the binding affinity value for all side chains present in each sampled segment; and (5) repeating each of steps (1) through (4) for each model in the first database and for each backbone in the second database.
 82. The method of claim 78 wherein step (ii) is carried out by substitution, addition, or deletion of one to nine amino acid residues from a potential T-cell epitope identified in step (i).
 83. A protein of claim 71 having an amino acid sequence that is free from T-cell epitopes.
 84. A protein having the following amino acid sequence (SEQ ID NO: 5): CDLPQTHSLGSRRTLMLLAQMRX⁰ISLFSCLKDRHDFGFPQEEFGNQFQKAETIPVLHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQLNDLEACVIQGVGVTETPLMKEDSILAVRKX¹X²QRX³TX⁴YLKEKKYSPCAWEVVRAEIMRSFSLSTNLQESLRSKE, wherein X⁰ is R or K; X¹ is Y, E, or Q; X² is F or H; X³ is I or A; and X⁴ is L or A; excluding proteins having amino acid sequences in which, simultaneously, X¹ is Y, X² is F, X³ is I, and X⁴ is L.
 85. A protein having the following amino acid sequence (SEQ ID NO: 6): CDLPQTHSLGSRRTLMLLAQMRX⁰ISLFSCLKDRHDFGFPQEEFGNQFQKAETIPVLHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQLNDLEACVIQGVGVTETPX¹X²KEDSX³X⁴AVRKX⁵X⁶QRX⁷TX⁸YLKEKKYSPCAWEVVRAEIMRSFSX⁹STNLQESLRSKE, wherein X⁰ is R or K; X¹ is L, S, or G; X² is M, T, S, or E; X³ is I, S, or Q; X⁴ is L or G; X⁵ is Y, E, or Q; X⁶ is F or H; X⁷ is I or A; X⁸ is L or A; and X⁹ is L or S; excluding proteins having amino acid sequences in which, simultaneously, X¹ is L; X² is M; X³ is I; X⁴ is L; X⁵ is Y; X⁶ is F; X⁷ is I; X⁸ is L; and X⁹ is L.
 86. A protein having the following amino acid sequence (SEQ ID NO: 7): CDLPQTHSLGSRRTLMLLAQMRX⁰ISLFSCLKDRHDFGFPQEEFGNQFQKAETIPVLHEMIQQX¹X²NX³X⁴STKDSSAAX⁵DETLLDKX⁶X⁷TELX⁸QQLNDLEACVIQGVGVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAWEVVRAEIMRSFSLSTNLQESLRSKE, wherein X⁰ is R or K; X¹ is I or T; X² is F, D, or A; X³ is L or A; X⁴ is F, D, or E; X⁵ is W or H; X⁶ is F, D, or E; X⁷ is Y or S; and X⁸ is Y, D, E, or N; excluding proteins having amino acid sequences in which, simultaneously, X¹ is I; X² is F; X³ is L; X⁴ is F; X⁵ is W; X⁶ is F; X⁷ is Y; and X⁸ is Y.
 87. A protein having the following amino acid sequence (SEQ ID NO: 8): CDLPQTHSLGSRRTLMLLAQMRX⁰ISLFSCLKDRHDFGFPQEEFGNQFQKAETIPVLHEMIQQX¹FNLFSTKDSSAAWDETLLDKFX²TELX³QQLNDLEACVIQGVGVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAWEVVRAEIMRSFSLSTNLQESLRSKE, wherein X⁰ is R or K; X¹ is I or T; X² is Y or S and X³ is Y, D, E, or N; excluding proteins having amino acid sequences in which, simultaneously, X¹ is I; X² is Y; and X³ is Y.
 88. A protein having the following amino acid sequence (SEQ ID NO: 9): CDLPQTHSLGSRRTLMLLAQMRX⁰ISX¹X²SCLKDRHDFGX³PQEEFGNQFQKAETIPX⁴LHEMIQQIFNLFSTKDSSAAWDETLLDKFYTELYQQLNDLEACVIQGVGVTETPLMKEDSILAVRKYFQRITLYLKEKKYSPCAWEVVRAEIMRSFSLSTNLQESLRSKE, wherein X⁰ is R or K; X¹ is L or P; X² is F or S; X³ is F or E; and X⁴ is V or A; excluding proteins having amino acid sequences in which, simultaneously, X¹ is L; X² is F; X³ is F; and X⁴ is V.
 89. An isolated polypeptide selected from the group of polypeptides set forth in FIG. 1.
 90. An isolated polypeptide selected from the group of polypeptides set forth in FIG. 7.
 91. An isolated polynucleotide encoding a protein of claim 71.
 92. An isolated polynucleotide encoding a protein of claim 84.
 93. An isolated polynucleotide encoding a protein of claim 85.
 94. An isolated polynucleotide encoding a protein of claim 86.
 95. An isolated polynucleotide encoding a protein of claim 87.
 96. An isolated polynucleotide encoding a protein of claim 88.
 97. An isolated polynucleotide encoding a polypeptide of claim 89.
 98. An isolated polynucleotide encoding a polypeptide of claim 90.
 99. A plasmid comprising a polynucleotide of claim 91.
 100. A plasmid comprising a polynucleotide of claim 92.
 101. A plasmid comprising a polynucleotide of claim 93.
 102. A plasmid comprising a polynucleotide of claim 94.
 103. A plasmid comprising a polynucleotide of claim 95.
 104. A plasmid comprising a polynucleotide of claim 96.
 105. A plasmid comprising a polynucleotide of claim 97.
 106. A plasmid comprising a polynucleotide of claim 98.